Flux LoRA Dataset Preparation Tool

Prepare image datasets for Flux LoRA training with easy AI captioning, PNG conversion, and flexible output options.

Features

Convert JPG/JPEG images to PNG format
Captioning with BLIP (AI) or from filenames
Save captions in .txt and/or .json formats
Simple CLI (prepare_flux_lora_dataset.py)
Modern GUI (flux_lora_dataset_gui.pyw) with Pause, Resume, and Stop controls
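
To illustrate the PNG conversion feature, here is a minimal sketch using Pillow. This README does not say which imaging library the tool uses, so Pillow and the helper names png_path_for and convert_to_png are assumptions made for illustration, not the tool's actual code.

```python
from pathlib import Path


def png_path_for(src: Path, out_dir: Path) -> Path:
    """Return the output path: same stem as the source, with a .png suffix."""
    return out_dir / (src.stem + ".png")


def convert_to_png(src: Path, out_dir: Path) -> Path:
    """Re-encode one JPG/JPEG file as PNG and return the new path."""
    # Lazy import (assumed dependency) so the path helper works without Pillow.
    from PIL import Image

    out_dir.mkdir(parents=True, exist_ok=True)
    dst = png_path_for(src, out_dir)
    with Image.open(src) as img:
        # PNG cannot store CMYK, so normalise unusual modes to RGB first.
        if img.mode not in ("RGB", "RGBA", "L", "LA", "P"):
            img = img.convert("RGB")
        img.save(dst, format="PNG")
    return dst
```

Calling convert_to_png(Path("images/photo.jpg"), Path("processed")) would write processed/photo.png and leave the original untouched.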

Installation

Install the required dependencies:

pip install -r requirements.txt

Usage

1. Command-Line (CLI)

Process all images in a directory using BLIP captioning:

python prepare_flux_lora_dataset.py input_directory output_directory

Options

--tagger {blip,simple}: Choose caption generator
- blip: Uses BLIP (AI; recommended)
- simple: Uses filenames (fast, no AI)
--format {txt,json,both}: Caption output format
- txt: .txt files (the format most trainers expect)
- json: .json files
- both: Both
--no-convert: Skip PNG conversion (keep originals)
--device {auto,cuda,cpu}: Device for BLIP
- auto (default): Use GPU if available, else CPU
- cuda: Force GPU
- cpu: Force CPU
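
The options above map naturally onto an argparse parser. The following is a sketch of how such a parser could look, not the script's actual source. The blip default for --tagger is inferred from the basic command above, the auto default for --device is documented, and the txt default for --format is an assumption because this README does not state it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented CLI options (illustrative)."""
    parser = argparse.ArgumentParser(prog="prepare_flux_lora_dataset.py")
    parser.add_argument("input_directory", help="folder with source images")
    parser.add_argument("output_directory", help="folder for processed output")
    parser.add_argument("--tagger", choices=["blip", "simple"], default="blip",
                        help="caption generator")
    # Assumed default: the README does not say which format is used when
    # --format is omitted.
    parser.add_argument("--format", choices=["txt", "json", "both"], default="txt",
                        help="caption output format")
    parser.add_argument("--no-convert", action="store_true",
                        help="skip PNG conversion and keep originals")
    parser.add_argument("--device", choices=["auto", "cuda", "cpu"], default="auto",
                        help="device for BLIP")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Using choices makes argparse reject unsupported values such as --device tpu with a usage error before any processing starts.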

Examples

Basic BLIP (both formats):

python prepare_flux_lora_dataset.py ./images ./processed --tagger blip --format both

TXT only:

python prepare_flux_lora_dataset.py ./images ./processed --format txt

No AI (filename tags):

python prepare_flux_lora_dataset.py ./images ./processed --tagger simple
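
This README does not document exactly how the simple tagger turns a filename into a caption. One plausible rule, shown here purely as an assumption, is to split the file stem on separators, drop numeric counters, and lowercase the rest. The function name caption_from_filename is hypothetical.

```python
import re
from pathlib import Path


def caption_from_filename(path: str) -> str:
    """Derive a rough caption from a filename (assumed rule, not the tool's)."""
    stem = Path(path).stem
    # Split on underscores, hyphens, dots, and whitespace.
    tokens = re.split(r"[_\-\s.]+", stem)
    # Drop empty tokens and pure numbers such as trailing counters.
    words = [t for t in tokens if t and not t.isdigit()]
    return " ".join(words).lower()


if __name__ == "__main__":
    print(caption_from_filename("images/Red_Sports-Car_001.jpg"))
```

Under this rule, descriptive filenames give usable captions, while names made up only of digits produce an empty caption.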

Skip PNG conversion (keep original files):

python prepare_flux_lora_dataset.py ./images ./processed --no-convert

2. Graphical User Interface (GUI)

The GUI makes dataset preparation easy and interactive. You can pause, resume, or stop processing at any time, which is ideal for big folders!

Start the GUI

python flux_lora_dataset_gui.pyw

Key Features

Folder selection with file browser
Choose tagger and output format
Optional PNG conversion
Live log and visual progress bar
Pause/Resume/Stop controls
Error reporting and summaries when done
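
Pause/Resume/Stop controls of this kind are commonly built on a background thread coordinated with threading.Event objects. The sketch below shows that pattern; it is an illustration under that assumption, and the class name PausableWorker is hypothetical rather than taken from the GUI's source.

```python
import threading


class PausableWorker:
    """Process items on a background thread with pause, resume, and stop."""

    def __init__(self, items, handle):
        self._items = list(items)
        self._handle = handle
        self._running = threading.Event()
        self._running.set()  # set means running, cleared means paused
        self._stop_requested = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def pause(self):
        self._running.clear()

    def resume(self):
        self._running.set()

    def stop(self):
        self._stop_requested.set()
        self._running.set()  # wake a paused worker so it can exit

    def join(self, timeout=None):
        self._thread.join(timeout)

    def _run(self):
        for item in self._items:
            self._running.wait()  # blocks here while paused
            if self._stop_requested.is_set():
                break
            self._handle(item)
```

The worker checks for pause and stop between items, so an image that is already being processed always finishes before the request takes effect.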

Output Format

TXT Files

Each image gets a .txt file with the caption:

image001.png
image001.txt  (contains: "a beautiful landscape with mountains and trees")

JSON Files

Each image gets a .json file with:

{
  "image": "image001.png",
  "caption": "a beautiful landscape with mountains and trees",
  "source": "image001.jpg"
}
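
The two output formats above can be produced with a few lines of standard-library Python. This is a sketch of one way to write the sidecar files with the documented fields; the function name write_caption and its parameters are hypothetical, not the tool's API.

```python
import json
from pathlib import Path


def write_caption(out_dir, image_name, caption, source_name, fmt="both"):
    """Write .txt and/or .json caption files next to the processed image."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(image_name).stem
    written = []
    if fmt in ("txt", "both"):
        txt_path = out_dir / f"{stem}.txt"
        txt_path.write_text(caption, encoding="utf-8")
        written.append(txt_path)
    if fmt in ("json", "both"):
        # Same three fields as the JSON example in this README.
        record = {"image": image_name, "caption": caption, "source": source_name}
        json_path = out_dir / f"{stem}.json"
        json_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
        written.append(json_path)
    return written
```

For example, write_caption("processed", "image001.png", "a beautiful landscape with mountains and trees", "image001.jpg") would create image001.txt and image001.json in the processed folder.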

Notes & Tips

Supports .jpg, .jpeg, and .png files.
The BLIP model downloads automatically on first run (~990 MB).
Use GPU (cuda) for much faster BLIP processing.
Large datasets may take time; use the GUI for easier management.
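
For readers curious how BLIP captioning and the auto device choice fit together, here is a minimal sketch using the Hugging Face transformers API. The checkpoint name Salesforce/blip-image-captioning-base is an assumption (this README does not say which BLIP checkpoint the tool loads), and pick_device and caption_image are hypothetical helper names.

```python
def pick_device(choice: str = "auto") -> str:
    """Resolve 'auto' to 'cuda' when a GPU is usable, otherwise 'cpu'."""
    if choice != "auto":
        return choice
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def caption_image(image_path: str, device: str = "auto",
                  model_id: str = "Salesforce/blip-image-captioning-base") -> str:
    """Caption one image with BLIP (model_id is an assumed checkpoint)."""
    import torch
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    device = pick_device(device)
    # Loaded per call for brevity; a real tool would load the model once.
    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id).to(device)
    with Image.open(image_path) as img:
        inputs = processor(images=img.convert("RGB"), return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

The first call to from_pretrained triggers the model download, which is why the first run takes noticeably longer than later ones.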

Troubleshooting

Out of memory?

Use --device cpu or select CPU in the GUI.
Split your dataset into smaller batches.
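
Splitting a dataset into smaller batches can be scripted. The sketch below copies images into numbered subfolders so each one can be passed to the CLI as its own input directory. The function name split_into_batches and the batch_001 folder naming are illustrative assumptions.

```python
import shutil
from pathlib import Path

# Extensions this README lists as supported.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_into_batches(input_dir, dest_root, batch_size):
    """Copy images into dest_root/batch_001, batch_002, ... and return the dirs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    files = sorted(p for p in Path(input_dir).iterdir()
                   if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES)
    batch_dirs = []
    for index, start in enumerate(range(0, len(files), batch_size), start=1):
        batch_dir = Path(dest_root) / f"batch_{index:03d}"
        batch_dir.mkdir(parents=True, exist_ok=True)
        for src in files[start:start + batch_size]:
            shutil.copy2(src, batch_dir / src.name)  # originals stay in place
        batch_dirs.append(batch_dir)
    return batch_dirs
```

Each returned folder can then be processed separately, for example python prepare_flux_lora_dataset.py ./batches/batch_001 ./processed.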

BLIP doesn't work?

Ensure you have transformers and torch:
pip install transformers torch
Check for enough disk space for the model.
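
Both checks above can be done from Python before starting a long run. This is a small diagnostic sketch using only the standard library; the helper names missing_packages and free_gib are hypothetical.

```python
import importlib.util
import shutil


def missing_packages(names=("transformers", "torch")):
    """Return the packages from names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def free_gib(path="."):
    """Return free disk space at path in GiB."""
    return shutil.disk_usage(path).free / 1024 ** 3


if __name__ == "__main__":
    print("Missing packages:", missing_packages() or "none")
    print(f"Free disk space: {free_gib():.1f} GiB")
```

The disk check looks at the current directory by default; pass the path of the drive that holds your model cache if it lives somewhere else.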

Image conversion errors?

Make sure your images are valid (not corrupted).
Check permissions for the input/output folders.
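
A quick pre-flight check can catch many of these problems. The sketch below compares file headers against the PNG and JPEG signatures and tests folder permissions. It is a coarse check under stated assumptions: it catches empty or mislabeled files but not truncated ones, for which a full decode (for example Pillow's Image.verify()) would be needed. The function names are hypothetical.

```python
import os

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # 8-byte PNG signature
JPEG_MAGIC = b"\xff\xd8\xff"      # start-of-image marker plus next marker byte


def looks_like_image(path) -> bool:
    """Return True if the file starts with a PNG or JPEG signature."""
    try:
        with open(path, "rb") as fh:
            header = fh.read(8)
    except OSError:
        return False
    return header.startswith(PNG_MAGIC) or header.startswith(JPEG_MAGIC)


def check_folders(input_dir, output_dir):
    """Return a list of permission problems (empty if none were found)."""
    problems = []
    if not os.access(input_dir, os.R_OK):
        problems.append(f"cannot read input folder: {input_dir}")
    if not os.access(output_dir, os.W_OK):
        problems.append(f"cannot write to output folder: {output_dir}")
    return problems
```

Running looks_like_image over the input folder first gives a list of suspect files to inspect or remove before the main run.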

Both CLI and GUI work cross-platform. For best results, use the GUI for large or complex batches!
