- Download the latest release from here
- Right click on the downloaded file and select `Run` or `Open`
- Wait for the application to start
(The Windows executable triggers a false positive in some antivirus software. If you don't trust the pre-built one, you can build the executable yourself.)
- Create a virtual environment and activate it
  `python3 -m venv venv`
  For Windows:
  `./venv/scripts/activate`
  For macOS and Linux:
  `source venv/bin/activate`
- Install the requirements
  `pip install -r requirements.txt`
- Run the application
  `python gui.py`

It may take a while to initialize the application for the first time.
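
The activation step differs by platform because `venv` puts its scripts in `Scripts` on Windows and in `bin` elsewhere. As a minimal sketch of that difference, the helper below (`activate_script` is a hypothetical name, not part of this project) returns the path of the activation script for a given environment directory:

```python
import os
from pathlib import Path


def activate_script(venv_dir: str = "venv", os_name: str = os.name) -> Path:
    """Return the path of the activation script inside a virtual environment.

    Hypothetical helper for illustration only; it is not part of this project.
    """
    root = Path(venv_dir)
    if os_name == "nt":  # Windows
        return root / "Scripts" / "activate"
    return root / "bin" / "activate"  # macOS and Linux


# Show the script that applies to the current platform.
print(activate_script())
```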
- Copy an image to your clipboard or select a file
- Click on the `load image from clipboard` or `load image from file` button
- Click on the `analyze image` button or press the keybinding
- The tags will be displayed in a new window (the first analysis takes a while because the pre-trained model has to be downloaded)
- You can copy the tags to your clipboard by clicking on the `copy tags to clipboard` button
Extra:
- Checking `Unload model after every analysis` can save you some memory, but each analysis will take longer
- You can choose the tag format; `Booru` and `Stable Diffusion` formats are currently supported
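
To illustrate how the two tag formats might differ, here is a rough sketch. It assumes that the Booru format keeps underscores and separates tags with spaces, and that the Stable Diffusion format uses comma-separated tags with spaces instead of underscores and escaped parentheses. These conventions, the `format_tags` function, and the `"stable_diffusion"` value are assumptions for illustration; the application's actual output may differ.

```python
def format_tags(tags, tag_format="booru"):
    """Join tag names into a single string in the requested format.

    Illustrative only: the format rules below are assumptions, not the
    application's confirmed behavior.
    """
    if tag_format == "booru":
        # Assumed Booru style: underscores kept, tags separated by spaces.
        return " ".join(tags)
    if tag_format == "stable_diffusion":
        # Assumed prompt style: spaces instead of underscores, parentheses
        # escaped so they are not read as prompt weighting, comma-separated.
        cleaned = []
        for tag in tags:
            text = tag.replace("_", " ")
            text = text.replace("(", r"\(").replace(")", r"\)")
            cleaned.append(text)
        return ", ".join(cleaned)
    raise ValueError(f"unknown tag format: {tag_format}")


example = ["1girl", "long_hair", "name_(series)"]  # made-up tags
print(format_tags(example, "booru"))
print(format_tags(example, "stable_diffusion"))
```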
After the first run, a config.ini file will be created in the same directory as the script. You can change the configuration there.
[GUI]
shortcut = Ctrl+Shift+I
unload_model_when_done = False
tag_format = booru
[Tagger]
model = wd-swinv2-v3
threshold = 0.35

The default model is `wd-swinv2-v3`. I also recommend these models:
- `wd-swinv2-v3` (default, with overall good performance)
- `wd-convnext-v3` (might handle rotated images better than other models)
- `wd-vit-v3` (good at character recognition)
- `wd14-moat-v2` (in case you want to use the old model)
The default confidence threshold is 0.35. Lower it if you want more tags, at the cost of accuracy.
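
As a sketch of how these settings could be read and how the threshold affects the result, the example below parses the configuration shown above with Python's standard `configparser` and filters a made-up set of tag scores. The `load_settings` and `filter_tags` helpers and the score values are hypothetical. Whether the application keeps tags at or strictly above the threshold is also an assumption here.

```python
import configparser

# Same content as the config.ini shown above, inlined so the example runs as-is.
SAMPLE_CONFIG = """
[GUI]
shortcut = Ctrl+Shift+I
unload_model_when_done = False
tag_format = booru

[Tagger]
model = wd-swinv2-v3
threshold = 0.35
"""


def load_settings(text):
    """Parse the INI text into a plain dict with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "shortcut": parser.get("GUI", "shortcut"),
        "unload_model_when_done": parser.getboolean("GUI", "unload_model_when_done"),
        "tag_format": parser.get("GUI", "tag_format"),
        "model": parser.get("Tagger", "model"),
        "threshold": parser.getfloat("Tagger", "threshold"),
    }


def filter_tags(scores, threshold):
    """Keep tags whose confidence is at least `threshold`, highest first."""
    kept = [(tag, score) for tag, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


settings = load_settings(SAMPLE_CONFIG)
scores = {"1girl": 0.98, "long_hair": 0.71, "smile": 0.30}  # made-up scores
print(filter_tags(scores, settings["threshold"]))
```

With these made-up scores, `smile` is dropped at the default threshold of 0.35 and would be kept if the threshold were lowered to 0.3 or below.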
- Users in mainland China might have trouble downloading the model from Hugging Face
- On macOS, the keybinding works when the script is executed from an IDE (e.g. PyCharm or VSCode), but not from a terminal. It also requires you to trust the IDE in `System Preferences -> Security & Privacy -> Privacy -> Input Monitoring` (not a safe practice, use at your own risk)
- Switching the keybinding through the GUI crashes on macOS (not sure why)
Original code by https://github.com/picobyte/stable-diffusion-webui-wd14-tagger
Public domain, except borrowed parts (e.g. dbimutils.py)
