Dataset Tools Rework #749

O-J1 · 2025-03-19T15:13:03Z

Given the ongoing complaints and requests about captioning, I decided it was time for a comprehensive (and final) overhaul. The PR is currently marked as a draft, since I fully expect Johnny to point out numerous issues or questionable decisions that deserve attention. To users reading this: OT will never be as focused as dedicated tools, and this is about as far as it will go.

Additions:

Visual redesign
New captioning models added: Moondream2, WD EVA02 v3, WD Swinv2 v3, and JoyTag. Removed Blip1 due to it being garbage. Consider upgrading Blip2 to Blip3 once released, if feasible.
Integrated Moondream + SAM2.1 for a Grounded-SAM-style approach. Moondream performed notably better at object detection in my tests (though this is somewhat subjective), and its repository was significantly easier to work with than GS.
Added common and practical image operations
Implemented useful caption management operations
The window now resizes properly, including the image
Added undo, redo, clear current caption, and a save button, each with a corresponding shortcut, along with other shortcut improvements
Multi-line caption support added (fixes previous issues with losing multi-line input)
Fixed bugs related to samples handling
File list now includes filtering capabilities
Enabled opening of the file browser directly at the current directory for all platforms (previously Windows-only)
Added JXL support via a Pillow plugin (Rust), since the PIL team does not seem to be moving to support it anytime soon. JXL offers lossless JPEG transcoding and significant space savings.
A cursor that indicates whether brush or fill is active!
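
As an illustration of the "open the file browser at the current directory on all platforms" item above, here is a minimal sketch. It is not the code from this PR: the function names `file_browser_command` and `open_in_file_browser` are hypothetical, and it assumes non-Windows, non-macOS systems have `xdg-open` (from xdg-utils) available.

```python
import platform
import subprocess
from pathlib import Path
from typing import List, Optional


def file_browser_command(directory: Path, system: Optional[str] = None) -> List[str]:
    """Return the command that opens `directory` in the platform's file browser."""
    system = system or platform.system()
    target = str(directory)
    if system == "Windows":
        return ["explorer", target]
    if system == "Darwin":  # macOS
        return ["open", target]
    # Assumption: any other platform is a desktop with xdg-utils installed.
    return ["xdg-open", target]


def open_in_file_browser(directory: Path) -> None:
    """Open `directory` in the file browser, refusing paths that are not directories."""
    if not directory.is_dir():
        raise NotADirectoryError(str(directory))
    # Popen does not wait for the file browser to exit, so the UI stays responsive.
    subprocess.Popen(file_browser_command(directory))
```

Keeping the command selection in a pure function makes the platform branching testable without actually launching anything.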

Initially, this PR was also supposed to include a samples rework, but just reaching the current stage took more effort than I expected. I've self-reviewed it as best I can, but after looking at it for so long, I'm certain I've become blind to some things.

If someone knows a well-tested, lightweight-ish photoreal replacement for Blip2, I am open to replacing it outright, but you have to provide lots of examples (preferably a peer-reviewed paper).

P.S. After this update, aside from major model improvements or truly groundbreaking developments (not incremental tweaks), I personally won't be addressing further data tool requests, and based on Nero's recent comments, I doubt he will either.

O-J1 · 2025-06-30T12:37:03Z

It's absolutely not perfect, but it works satisfactorily now. Marking as ready for review. Many changes and refactors will probably have to happen, but I am committed to getting this merged at some point.

modules/module/captioning/caption_config_types.py

requirements-global.txt

modules/ui/BulkCaptionEditWindow.py

O-J1 added 30 commits March 5, 2025 20:12

Reworks dataset tools. Icons, better buttons sizes and lablel tweks + sampling logic more defensive

cdbfc77

Constants, multi-line caption support. Default size more usable on all screens

ee49746

Significant refactor, slightly higher res + new icons

41d5511

Enable icons.py to work with pallete pngs + further UI tweaks.

d5bba22

Further tweaks but now broken image scaling.

bf1c2d7

Working classes for MaskEditor, mostly. Image still bricked.

bfa77e8

Still broken image scaling

7820eb3

Fixed image and masking, keybinds and more

0302243

Better types, switched to pathlib, only save caption and mask if change

6e96bf3

Fix accidental brush logic inversion

97f0155

Renamed a few functions.

01aab67

Doc string

f6599ff

Tweaks to CaptionUI, refactors generate windows, modified fill mode for Caption model too. (To work more reliably)

5cb8d61

Initial Rating & char thresholds, Mcut, blacklists + refactoring

084a208

Blacklist

bdd0829

Slightly narrower window

8bfb84e

Add License for Lucide icons.

0177c5f

Nuke Blip, prepare to add JT and Moondream, adjust readme

1d7eb49

No attempts to silence libvips worked, reverting.

2156707

Deleted logging_util.py

a2fd62a

Remove debug, typo fix.

97be50e

More refactors and reorganising, cursors and such.

9a81f05

Move and fix filtering tokens out of result

45c705e

Finalise icons.py

938c792

Finalise BM.py

388176f

Add CPUExecutionProvider to stop warning

eeced18

Refine Moondream2Model.py

905c5fd

Refine SAM+Moondream segmenting model

1a04008

refine models and utils

d504545

Refine components.py

59b7d17
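
For readers unfamiliar with the "Mcut" and "blacklists" mentioned in commit 084a208 above, here is a minimal sketch. It assumes "Mcut" refers to Maximum Cut Thresholding as commonly used with WD taggers: sort the tag probabilities, find the largest gap between neighbours, and place the threshold in the middle of that gap. The function names and the example tags are hypothetical, not taken from this PR.

```python
from typing import Dict, FrozenSet, List, Sequence


def mcut_threshold(probs: Sequence[float]) -> float:
    """Return the midpoint of the largest gap between adjacent sorted probabilities."""
    if len(probs) < 2:
        raise ValueError("MCut needs at least two probabilities")
    ordered = sorted(probs, reverse=True)
    gaps = [ordered[i] - ordered[i + 1] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps))  # first position of the largest gap
    return (ordered[cut] + ordered[cut + 1]) / 2


def select_tags(scores: Dict[str, float],
                blacklist: FrozenSet[str] = frozenset()) -> List[str]:
    """Keep tags scoring above the MCut threshold, minus blacklisted ones."""
    threshold = mcut_threshold(list(scores.values()))
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [tag for tag, prob in ranked if prob > threshold and tag not in blacklist]
```

The appeal of this approach is that the cutoff adapts per image instead of relying on one fixed probability threshold for every caption.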

O-J1 added 16 commits June 25, 2025 03:35

Ensure GMW yields to CaptionUI if it has a valid path, else use session path

2606fdc

Update moondream version, add moondream tag-lish mode

ea0cac9

Add dpid for reduction factors above 3 and split out FileOperations into appropriate classes.

fff3d85

FilerOperationsWindow: Add run_parallel_task, remove redundancy, modify progress constants and make window taller.

1e94bb2

Revert install.bat to upstream version now that its been merged

1aeece7

Merge branch 'master' into dataset-and-samples-rework

7034d21

Add tests, fix clear focus bug, fix FileOps edge cases, adjusts pyproject.toml to make it installable avoiding sys.path hack

f2da83f

Reorganise BulkCaptionEdit logic, fix silent fail if no dir open

f0c7a62

- move shared functions into helpers.py

0248f4e

- add BulkCaptionEdit tests

Add error box when clicking open in explorer button, tweaked logging

3397c24

Hide scrollbar on load/blank and reduce CPU load from scrolling (from 16%~ to 5%)

eb950f4

- Cleaning up GenerateMasksWindow

0084149

- Make MaksByColor.py go from 2.0s/it to 1.2s ish - fix SAMdream masking regression

Moved GenerateMasksWindow to MVC + add tests. Addeds msgbox for caption and mask.

b5c87d2

Make the filelist update when maximising too

3f3fc1c

Debounced the on_window_resize update to avoid repeatedly firing

7030940

Remove unrequired dep for dpid.

801e926
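
The debounce commit above (7030940) describes a common UI pattern: while resize events keep arriving, postpone the expensive update and run it once after they stop. A minimal sketch follows, assuming a Tk-style widget that exposes `after()` and `after_cancel()`. The `Debouncer` class is hypothetical and is not the implementation in this PR.

```python
from typing import Any, Callable, Optional


class Debouncer:
    """Run `callback` once, `delay_ms` after the most recent trigger."""

    def __init__(self, widget: Any, delay_ms: int, callback: Callable[[], None]) -> None:
        self._widget = widget  # anything exposing Tk's after()/after_cancel()
        self._delay_ms = delay_ms
        self._callback = callback
        self._pending: Optional[Any] = None

    def trigger(self, _event: Any = None) -> None:
        # Cancel the run scheduled by the previous event, then reschedule.
        if self._pending is not None:
            self._widget.after_cancel(self._pending)
        self._pending = self._widget.after(self._delay_ms, self._fire)

    def _fire(self) -> None:
        self._pending = None
        self._callback()
```

With a real Tk root this could be wired up as `root.bind("<Configure>", Debouncer(root, 150, refresh_file_list).trigger)`, where `refresh_file_list` and the 150 ms delay are placeholder choices.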

O-J1 marked this pull request as ready for review June 30, 2025 12:37

O-J1 added 3 commits July 1, 2025 18:15

unload all models when captionUI is closed

1731b38

Modifylogging in the unload function

9577fa5

Use existing torch_gc

b0dd6cc

Nerogar requested changes Jul 13, 2025

Apply requested changes from review

c6f9d19

O-J1 added the Effort: High label Sep 4, 2025

O-J1 linked an issue Oct 12, 2025 that may be closed by this pull request

[Bug]: Additional lines in a caption text get deleted when creating a mask. #618

Open

O-J1 added this to the Maxwell/Pascal sunset milestone Oct 14, 2025

dxqb marked this pull request as draft October 15, 2025 04:03

O-J1 mentioned this pull request Oct 21, 2025

[Feat]: Replace CustomTkInter with PyQt6 #584

Open

dxqb linked an issue Oct 24, 2025 that may be closed by this pull request

[Feat]: JXL support #1040

Open

HashakGik mentioned this pull request Nov 22, 2025

UI entirely rewritten in QT6 (PySide6 bindings) #1164

Open

1 task


5 participants
