About
I tend to spend a lot of time with my datasets, hopping between different tools and scripts to condition them over multiple iterations until I think they're ready to train on.
The scripts and tools I use xD:

So I thought, why not make a project that combines all the capabilities of the scripts and tools I use into a single interface focused on quick editing? That was the main objective for the last week!
I've planned for 3 major modules:
Dataset formulation
Image processing
Tag Manager

Log Update 19/12/2025
I added support for more tagging models. I'll be providing links for them later, but not the actual model files, as some of them are 1 GB in size.
Still, I like being able to do additive auto-tagging passes to pick up more captions if needed.
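A minimal sketch of what an additive pass could look like, assuming each tagger pass just returns a list of tag strings. The function name `merge_tags` is hypothetical and not taken from the project; the idea is only that a later pass appends what is new and never reorders or duplicates what is already there.

```python
def merge_tags(existing, new):
    """Append tags from a new tagging pass, keeping the existing order
    and skipping anything that is already present."""
    seen = set(existing)
    merged = list(existing)
    for tag in new:
        if tag not in seen:
            seen.add(tag)
            merged.append(tag)
    return merged


# First pass from one tagger, second pass from another (made-up tags).
first_pass = ["1girl", "solo", "long hair"]
second_pass = ["solo", "smile", "long hair", "outdoors"]
print(merge_tags(first_pass, second_pass))
```

Keeping the first pass untouched means manual edits made between passes survive a later auto-tagging run.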

Log Update 16/12/2025
It wasn't originally in my plans, but I ended up adding ONNX tagging to the toolset. It was a headache to understand at first, but I managed to get it up and working properly in the end.


For now I hardcoded it to use "SmilingWolf/wd-vit-tagger-v3" because, tbh, that model has worked like a charm for me for almost two years now, so that's basically all I'm gonna need so far.
I also started working on the "Image processing" module. Here's what it looks like so far:

Log Update 12/12/2025
K, so I cleaned up the overall project structure to start prepping for the next module, "Image processing". I added:
"auto-filtering", which filters images as soon as tags are clicked while the filter is active
"shift + click selection" for the global tags section
an "invert selection" button
"toggle filter hotkey" by pressing F


Also added themes and a Settings page to set them in.

Log Update 08/12/2025
I added more hotkeys for quicker manipulation. Basically, I made it so the whole dataset can be manipulated with the arrow keys. I also replaced the "multi select" toggle with a simple Ctrl+Click; now it feels much better to use.

This is more of an experimental feature to test how flexible my dataset pipeline is: I loaded images directly from Gelbooru using the "_g" cmd + post IDs in the directory input, and it loaded them as a tagged dataset that I can edit and then save later in img+txt format.

Might add other API commands later if deemed useful.
Log Update 06/12/2025
I added more dataset loading options beyond the usual (img/txt) format, in the form of fallback layers that are easily toggleable.

the options so far are:
metadata: gets captions from the embedded positive prompt inside the image, if it exists
filename: uses the sub-directory nesting as tags (I use this a lot)
empty: loads images without captions as entries
All of these options are auto-layered by priority, if toggled, in this order: (txt file, metadata, filename, empty)
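Here's a rough sketch of how that layering could work, just to illustrate the priority idea. All the names (`resolve_caption`, `from_txt`, and so on) are hypothetical and not the project's actual code, and the metadata reader is left as a stub because reading an embedded prompt depends on the image format and library used. Whether the txt layer is toggleable like the others is also an assumption here.

```python
from pathlib import Path
from typing import Optional


def from_txt(image: Path, root: Path) -> Optional[str]:
    # Classic img/txt pair: "a.png" next to "a.txt".
    txt = image.with_suffix(".txt")
    return txt.read_text(encoding="utf-8").strip() if txt.is_file() else None


def from_metadata(image: Path, root: Path) -> Optional[str]:
    # Stub (assumption): a real version would read the embedded positive prompt.
    return None


def from_filename(image: Path, root: Path) -> Optional[str]:
    # Sub-directory nesting relative to the dataset root becomes the tags.
    parts = image.relative_to(root).parent.parts
    return ", ".join(parts) if parts else None


def from_empty(image: Path, root: Path) -> Optional[str]:
    # Last resort: keep the image as an entry with an empty caption.
    return ""


# Fixed priority order: txt file, metadata, filename, empty.
PRIORITY = [
    ("txt", from_txt),
    ("metadata", from_metadata),
    ("filename", from_filename),
    ("empty", from_empty),
]


def resolve_caption(image: Path, root: Path, enabled: set):
    """Return (source_name, caption) from the first enabled layer that
    yields a caption, or None if no enabled layer applies."""
    for name, source in PRIORITY:
        if name not in enabled:
            continue
        caption = source(image, root)
        if caption is not None:
            return name, caption
    return None
```

With "empty" toggled off, an image that has no txt file, no metadata, and no sub-directory simply returns None and can be skipped by the loader.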
... and yup it worked xDD.. take a look

It's as simple as entering the directory of some of my gens; now I can simply work on it and then export it in (img/txt) format.
Might also add an option to export captions embedded in the images themselves, for the sake of archiving.
Tag manager
So far I've managed to make the "Tag Manager" module. It's basically my own spin on "sd-tagtool" (one of my most used tools by far).
Here's what the "Tag Manager" module looks like so far:

As you might have noticed, I used the same layout as sd-tagtool since I've gotten used to seeing it, but added some features that I always wanted to have, such as:
tag counting and sorting
replaced order number with:
push to first
push to last
swap tag order
tag shuffling
dark mode ☕
added an image view (zoomed view)
delete image quick action
added hotkeys
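To make the tag ordering actions from the list above concrete, here's a small sketch assuming a caption is just a list of tag strings. The function names are hypothetical (not the project's API), and `keep_first` on the shuffle is an extra assumption added here for illustration, not something the tool is stated to have.

```python
import random
from collections import Counter


def count_tags(captions):
    """Count tags across all captions, sorted from most to least frequent."""
    return Counter(tag for tags in captions for tag in tags).most_common()


def push_first(tags, tag):
    """Move `tag` to the front; leave the list unchanged if it is absent."""
    if tag not in tags:
        return list(tags)
    return [tag] + [t for t in tags if t != tag]


def push_last(tags, tag):
    """Move `tag` to the end; leave the list unchanged if it is absent."""
    if tag not in tags:
        return list(tags)
    return [t for t in tags if t != tag] + [tag]


def swap_tags(tags, a, b):
    """Swap the positions of two tags that are both present."""
    out = list(tags)
    i, j = out.index(a), out.index(b)
    out[i], out[j] = out[j], out[i]
    return out


def shuffle_tags(tags, keep_first=0, seed=None):
    """Shuffle the tags, optionally pinning the first `keep_first` in place."""
    head, tail = list(tags[:keep_first]), list(tags[keep_first:])
    random.Random(seed).shuffle(tail)
    return head + tail
```

Every function returns a new list instead of editing in place, which keeps undo and non-destructive export simple.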
And you might notice some of the tags have icons. Yes, those are tag groups (editable txt files) that can help with quick selection at a glance, which is helpful when training style or character LoRAs.
And I made it so that I can view them better with a coloring option in "ViewOptions".


I can also swap between color modes on the fly to better visualize my data tags

This is the "heat color mode", based on the tag frequency tier
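One way the frequency tiers behind a heat view could be computed, as a sketch only. The number of tiers and the "relative to the most frequent tag" rule are assumptions made here; the actual thresholds the tool uses aren't described in this post, and `frequency_tiers` is a hypothetical name.

```python
def frequency_tiers(counts, n_tiers=5):
    """Map each tag to a tier from 0 (rare) to n_tiers - 1 (most frequent),
    scaled against the count of the most frequent tag."""
    if not counts:
        return {}
    top = max(counts.values())
    return {
        # Integer scaling; the most frequent tag is capped at the last tier.
        tag: min(n_tiers - 1, (count * n_tiers) // top)
        for tag, count in counts.items()
    }


print(frequency_tiers({"1girl": 10, "smile": 5, "hat": 1}))
```

Each tier would then be mapped to a color, from cool for rare tags to hot for tags that appear almost everywhere.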

And this is the "user color mode", where it shows custom picked colors
Picking colors is as easy as selecting the tags and choosing one from the color menu, or even better, by pressing a number key from [1] to [9], or [0] to clear it
Now here's the most exciting part yet! I made it so that the colors can be used for quick selection in bulk

There are also other quick selection options, like by tag group or tag frequency

Selected tags can then be used for any sort of action, like [insertion, withdrawal, total removal, reorder, coloring ...]
Now here's the tag creation dialog. So far it's a basic one with no autocomplete yet

Img viewer
a simple image viewer to better see the selected data

Filtering
Filtering is as simple as toggling the filter on/off with tags selected

By default the filter does not hide images until the option "hide filtered images" is toggled

this is the filter with "hide filtered images" OFF

this is the filter with "hide filtered images" ON, both can be useful depending on the use case
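The two filter behaviors can be sketched like this, assuming the dataset is a mapping from image name to its tag list. The names `apply_filter` and `match_all` are hypothetical, and whether the real filter requires all selected tags or any of them is an assumption (exposed here as a switch).

```python
def apply_filter(dataset, selected, hide_filtered=False, match_all=True):
    """Return a list of (image_name, matches) pairs.

    With hide_filtered OFF every image is kept and only flagged, so the UI
    can dim the non-matching ones; with it ON they are dropped entirely.
    """
    selected = set(selected)
    result = []
    for name, tags in dataset.items():
        tagset = set(tags)
        if match_all:
            matches = selected <= tagset      # every selected tag is present
        else:
            matches = bool(selected & tagset)  # at least one selected tag
        if matches or not hide_filtered:
            result.append((name, matches))
    return result


data = {"a.png": ["1girl", "smile"], "b.png": ["1girl"], "c.png": ["scenery"]}
print(apply_filter(data, ["1girl", "smile"]))
print(apply_filter(data, ["1girl", "smile"], hide_filtered=True))
```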
also here are the hotkeys (so far)

finally the export options

The default "export" downloads the dataset as a Zip and keeps things non-destructive, but there's also an option to apply changes directly to the work directory, which is useful for doing more work on the dataset.
Also note that the "load" button actually copies the dataset files to the "work directory" so they're accessible by the project.
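For the non-destructive Zip export, a minimal sketch using Python's standard `zipfile` module could look like this. The function name `export_zip`, the flat archive layout, and the comma-separated caption format are assumptions for illustration; the original files are only read, never modified.

```python
import zipfile
from pathlib import Path


def export_zip(entries, zip_path):
    """Write each image plus a same-named .txt caption into a zip archive.

    `entries` maps an image path to its list of tags. Files are stored flat
    by file name, so this sketch assumes the names are unique.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for image, tags in entries.items():
            image = Path(image)
            # Copy the image bytes as-is into the archive.
            zf.write(image, arcname=image.name)
            # Generate the caption file on the fly from the edited tags.
            zf.writestr(image.with_suffix(".txt").name, ", ".join(tags))
    return Path(zip_path)
```

Since the captions are generated at export time, the work directory stays untouched until the "apply changes" option is used instead.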
so yeah I'm just so proud of this project that i wanted to share my progress on it 💚






