
Making my own dataset toolset ✂


Dec 4, 2025

(Updated: 2 days ago)


About

I tend to spend a lot of time on my datasets, hopping between different tools and scripts to condition them over multiple iterations until I think they're ready to train on.

The scripts and tools I use xD:

image.png

So I thought, why not make a project that combines all the capabilities of the scripts and tools I use into a single interface focused on quick editing? That was my main objective for the last week!

I've planned for 3 major modules:

  • Dataset formulation

  • Image processing

  • Tags Manager

Screenshot 2025-12-04 211252.jpg

Log Update 19/12/2025

I added support for more tagging models. I will be providing links for them later, but not the actual model files, as some of them are 1 GB large.

Still, I like being able to do additive auto-tagging passes to pick up more captions if needed.

image.png
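An additive pass can be thought of as a merge that only appends tags the image does not already have. Here is a minimal sketch of that idea, not the toolset's actual code; the function name `merge_tag_passes` is made up for illustration.

```python
def merge_tag_passes(existing: list[str], *passes: list[str]) -> list[str]:
    """Append tags from each tagging pass, keeping first-seen order and no duplicates."""
    merged: dict[str, None] = dict.fromkeys(existing)  # dicts preserve insertion order
    for tags in passes:
        for tag in tags:
            merged.setdefault(tag, None)  # only adds the tag if it is new
    return list(merged)


if __name__ == "__main__":
    first_pass = ["1girl", "solo"]
    second_pass = ["solo", "smile"]  # hypothetical output of another tagger
    print(merge_tag_passes(first_pass, second_pass))
```

Existing captions stay in place and keep their order, so a second model can only add to what is already there.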

Log Update 16/12/2025

While it wasn't originally in my plans to do so, I ended up adding ONNX tagging to the toolset. It was a headache to understand at first, but I managed to get it up and working properly in the end.

Screenshot 2025-12-16 174715.jpg

image.png

For now I hardcoded it to use "SmilingWolf/wd-vit-tagger-v3" because, tbh, that model has been working like a charm for me for almost two years now, so that's basically all I'm gonna need so far.

I also started working on the "Image processing" module; here's what it looks like so far:
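The inference call itself goes through an ONNX runtime and is not shown here. As a rough sketch of the step that usually follows it, assuming the model gives back one score per tag name, the scores can be turned into a caption by thresholding. The function name `select_tags` and the 0.35 default are assumptions for illustration, not values taken from the toolset.

```python
def select_tags(names: list[str], scores: list[float], threshold: float = 0.35) -> list[str]:
    """Keep tags whose score reaches the threshold, highest score first."""
    if len(names) != len(scores):
        raise ValueError("names and scores must have the same length")
    kept = [(score, name) for name, score in zip(names, scores) if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # most confident tags first
    return [name for _, name in kept]


if __name__ == "__main__":
    # Hypothetical model output for a single image.
    print(select_tags(["1girl", "solo", "hat"], [0.98, 0.40, 0.10]))
```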

Screenshot 2025-12-16 at 18-02-21 NV Toolset.png


Log Update 12/12/2025

K, so I cleaned up the overall project structure to start prepping for the next module, "Image processing". I added:

  • "auto-filtering" that filters images as soon as tags are clicked while the filter is active

  • "shift + click selection" for the global tags section

  • invert selection button

  • "toggle filter hotkey" by pressing F
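For anyone curious how the selection helpers above can work, here is a small sketch under my own assumptions: tags are shown in a fixed order, shift + click selects everything between the last clicked tag and the new one, and invert flips the current selection. The function names are hypothetical.

```python
def shift_click_range(order: list[str], anchor: str, clicked: str) -> list[str]:
    """Return every tag between the anchor and the clicked tag, inclusive."""
    i, j = order.index(anchor), order.index(clicked)
    low, high = sorted((i, j))  # works whether the click is above or below the anchor
    return order[low:high + 1]


def invert_selection(order: list[str], selected: set[str]) -> list[str]:
    """Return the tags that are currently NOT selected, in display order."""
    return [tag for tag in order if tag not in selected]


if __name__ == "__main__":
    tags = ["1girl", "hat", "smile", "solo", "tree"]
    print(shift_click_range(tags, "solo", "hat"))
    print(invert_selection(tags, {"hat", "smile"}))
```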

Screenshot 2025-12-12 202538.jpg

Screenshot 2025-12-12 171007.jpg

Also added themes and a Settings page to set them in.

Screenshot 2025-12-12 171115.jpg


Log Update 08/12/2025

I added more hotkeys for quicker manipulation; basically I made it so the whole dataset can be manipulated with the arrow keys. I also replaced the "multi select" toggle with a simple Ctrl+Click, and now it feels much better to use.

Screenshot 2025-12-08 153508.jpg

This is more of an experimental feature to test how flexible my dataset pipeline is: I loaded images directly from Gelbooru using the "_g" cmd + post IDs in the directory input, and it loaded them as a tagged dataset that I can edit and then save later in img+txt format.

Screenshot 2025-12-08 at 15-09-56 NV Toolset.png

Might add other API commands later if deemed useful.
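To give an idea of how one input box can serve both purposes, here is a sketch of the parsing step only. The exact command syntax is an assumption on my part ("_g" followed by IDs separated by spaces or commas), the function name is hypothetical, and the actual download from Gelbooru is left out.

```python
import re


def parse_directory_input(text: str) -> tuple[str, object]:
    """Classify the input as a Gelbooru command or a plain directory path."""
    text = text.strip()
    parts = re.split(r"[\s,]+", text)
    if parts[0] == "_g":
        # Assumed syntax: "_g 123 456" or "_g 123,456"; non-numeric tokens are ignored.
        post_ids = [int(p) for p in parts[1:] if p.isdigit()]
        return ("gelbooru", post_ids)
    return ("directory", text)


if __name__ == "__main__":
    print(parse_directory_input("_g 123, 456"))
    print(parse_directory_input("C:/datasets/my gens"))
```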


Log Update 06/12/2025

I added more dataset loading options beyond the known (img/txt) format, in the form of fallback layers that are easily toggleable.

Screenshot 2025-12-06 191239.jpg

The options so far are:

  • metadata: get captions from the embedded positive prompt inside the image if it exists

  • filename: uses the sub-directory nesting as tags (I use this a lot)

  • empty: load images without captions as entries

All of these options are auto-layered by priority, if toggled, in this order: (txt file, metadata, filename, empty).
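The layering boils down to trying each enabled source in priority order and taking the first one that yields a caption. Here is a minimal sketch of that idea, not the toolset's real code: all function names are hypothetical, captions are assumed to be comma-separated, and the metadata layer is left as a placeholder because reading an embedded prompt needs an image library.

```python
from pathlib import Path
from typing import Callable, Optional

Loader = Callable[[Path, Path], Optional[list[str]]]


def from_txt(image: Path, root: Path) -> Optional[list[str]]:
    """Read comma-separated tags from the sidecar .txt file, if it exists."""
    sidecar = image.with_suffix(".txt")
    if not sidecar.is_file():
        return None
    text = sidecar.read_text(encoding="utf-8")
    return [tag.strip() for tag in text.split(",") if tag.strip()]


def from_metadata(image: Path, root: Path) -> Optional[list[str]]:
    """Placeholder: a real version would read the embedded positive prompt."""
    return None


def from_filename(image: Path, root: Path) -> Optional[list[str]]:
    """Use the sub-directory names between the dataset root and the image as tags."""
    parts = image.relative_to(root).parent.parts
    return list(parts) or None


def from_empty(image: Path, root: Path) -> Optional[list[str]]:
    """Last resort: accept the image with no captions at all."""
    return []


# Priority order from the post: txt file, metadata, filename, empty.
LAYERS: list[tuple[str, Loader]] = [
    ("txt", from_txt),
    ("metadata", from_metadata),
    ("filename", from_filename),
    ("empty", from_empty),
]


def load_caption(image: Path, root: Path, enabled: set[str]) -> Optional[list[str]]:
    """Return tags from the first enabled layer that produces any, else None."""
    for name, loader in LAYERS:
        if name in enabled:
            tags = loader(image, root)
            if tags is not None:
                return tags
    return None  # no enabled layer could load this image
```

With "filename" enabled, an image at `root/cats/black/a.png` that has no .txt next to it would come in tagged as `cats, black`.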

... and yup it worked xDD.. take a look

Screenshot 2025-12-06 at 19-12-27 NV Toolset.png

It was as simple as putting in the directory for some of my gens; now I can simply work on it and then export it in (img/txt) format.

Might also add an option to export captions embedded in the images themselves for the sake of archiving.


Tag manager

So far I've managed to make the "Tag Manager" module; it's basically my own spin on "sd-tagtool" (one of my most used tools by far).

here's what the "Tag Manager" module looks like so far

Screenshot 2025-12-04 at 21-03-50 NV Toolset.png

As you might have noticed, I used the same layout as sd-tagtool since I've gotten used to seeing it, but I added some features that I always wanted to have, such as:

  • tag counting and sorting

  • replaced order number with:

    • push to first

    • push to last

    • swap tag order

  • tag shuffling

  • dark mode ☕

  • added an image view (zoomed view)

  • delete image quick action

  • added hotkeys
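A few of the tag operations in that list are easy to picture as plain list functions. This is only a sketch under my own assumptions (captions as lists of strings, the dataset as a dict of image name to tags); the function names are made up and are not the toolset's API.

```python
import random
from collections import Counter
from typing import Optional


def push_first(tags: list[str], tag: str) -> list[str]:
    """Move a tag to the front of the caption."""
    return [tag] + [t for t in tags if t != tag]


def push_last(tags: list[str], tag: str) -> list[str]:
    """Move a tag to the end of the caption."""
    return [t for t in tags if t != tag] + [tag]


def swap(tags: list[str], a: str, b: str) -> list[str]:
    """Exchange the positions of two tags."""
    out = list(tags)
    i, j = out.index(a), out.index(b)
    out[i], out[j] = out[j], out[i]
    return out


def shuffle_tags(tags: list[str], seed: Optional[int] = None) -> list[str]:
    """Return a shuffled copy; pass a seed for a repeatable order."""
    out = list(tags)
    random.Random(seed).shuffle(out)
    return out


def count_tags(dataset: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Count tags across the dataset, most frequent first, ties sorted by name."""
    counts = Counter(tag for tags in dataset.values() for tag in tags)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Every function returns a new list instead of editing in place, which keeps the original caption around until changes are saved.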

You might also notice that some of the tags have icons; yes, those are tag groups (editable txt files) that help with quick selection at a glance, which is helpful when training style or character LoRAs.

I also made it so that I can view them better with the coloring option in "ViewOptions".

Screenshot 2025-12-04 210452.jpg

Screenshot 2025-12-04 at 21-04-23 NV Toolset.png

I can also swap between color modes on the fly to better visualize my data tags

Screenshot 2025-12-04 210519.jpg

This is "heat color mode", based on the tag frequency tier.
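One simple way to get frequency tiers is to scale each tag's count against the most frequent tag and bucket the result. The sketch below assumes 5 tiers and linear scaling, both of which are my own choices for illustration; the toolset may bucket differently.

```python
def frequency_tiers(counts: dict[str, int], tiers: int = 5) -> dict[str, int]:
    """Map each tag to a tier from 0 (rare) to tiers - 1 (most frequent)."""
    if not counts:
        return {}
    top = max(counts.values())  # tag counts are assumed to be positive integers
    return {
        tag: min(tiers - 1, (count * tiers) // top)
        for tag, count in counts.items()
    }


if __name__ == "__main__":
    print(frequency_tiers({"1girl": 10, "hat": 5, "tree": 1}))
```

Each tier can then be mapped to a color, from cold for rare tags to hot for the ones that show up everywhere.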

Screenshot 2025-12-04 at 21-06-22 NV Toolset.png

and this is the "user color mode" where it shows custom picked colors

Picking colors is as easy as selecting the tags and choosing one from the color menu, or even better, pressing a number key from [1] to [9], or [0] to clear it.

Now here's the most exciting part yet! I made it so that the colors can be used for quick selection in bulk.

Screenshot 2025-12-04 210717.jpg
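The number-key coloring and the color-based bulk selection can be sketched with a single dict that maps tags to color slots. This is an illustration under my own assumptions (slots 1 to 9, key 0 clears), with hypothetical function names.

```python
def press_number_key(colors: dict[str, int], selected: list[str], key: int) -> None:
    """Assign color slot 1-9 to the selected tags, or clear their color with 0."""
    if not 0 <= key <= 9:
        raise ValueError("key must be between 0 and 9")
    for tag in selected:
        if key == 0:
            colors.pop(tag, None)  # clearing an uncolored tag is a no-op
        else:
            colors[tag] = key


def select_by_color(colors: dict[str, int], slot: int) -> list[str]:
    """Return every tag that currently has the given color slot."""
    return [tag for tag, color in colors.items() if color == slot]


if __name__ == "__main__":
    colors: dict[str, int] = {}
    press_number_key(colors, ["hat", "scarf"], 3)
    press_number_key(colors, ["tree"], 5)
    print(select_by_color(colors, 3))
```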

There are also other quick selection options, like by tag group or tag frequency.

Screenshot 2025-12-04 210742.jpg

Selected tags can then be used for any sort of action, like [insertion, withdrawal, total removal, reorder, coloring ...].

Now here's the tag creation dialog; so far it's a basic one with no autocomplete yet.

Screenshot 2025-12-04 210817.jpg

Img viewer

A simple image viewer to better see the selected data.

Screenshot 2025-12-04 210906.jpg

Filtering

Filtering is as simple as toggling the filter on/off with tags selected.

Screenshot 2025-12-04 210929.jpg

By default the filter does not hide images until the option "hide filtered images" is toggled.

Screenshot 2025-12-04 at 21-11-02 NV Toolset.png

This is the filter with "hide filtered images" OFF.

Screenshot 2025-12-04 at 21-11-22 NV Toolset.png

This is the filter with "hide filtered images" ON. Both can be useful depending on the use case.

Also, here are the hotkeys (so far):
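The difference between the two modes can be sketched as one function with a flag. I'm assuming here that an image matches when it has all of the selected tags; the real matching rule may differ, and the names are hypothetical.

```python
def apply_filter(
    dataset: dict[str, list[str]],
    selected: set[str],
    hide_filtered: bool,
) -> list[tuple[str, bool]]:
    """Return (image, matches) pairs; non-matching images are dropped only when hiding."""
    results = []
    for image, tags in dataset.items():
        matches = selected.issubset(tags)  # assumption: ALL selected tags must be present
        if matches or not hide_filtered:
            results.append((image, matches))
    return results


if __name__ == "__main__":
    data = {"a.png": ["1girl", "hat"], "b.png": ["1girl"]}
    print(apply_filter(data, {"hat"}, hide_filtered=False))
    print(apply_filter(data, {"hat"}, hide_filtered=True))
```

With hiding off, every image stays in the grid and the `matches` flag is enough to mark the ones that pass; with hiding on, only the matches are returned.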

Screenshot 2025-12-04 211150.jpg

Finally, the export options:

Screenshot 2025-12-04 211231.jpg

The default "export" downloads the dataset as a Zip and keeps things non-destructive, but there's also an option to apply changes directly to the work directory, which is useful for doing more work on the dataset.

Also note that the "load" button actually copies the dataset files to the "work directory" so they are accessible to the project.
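The non-destructive Zip export can be pictured as writing each image next to a .txt with its tags into an in-memory archive, leaving the files on disk untouched. This is a sketch with Python's standard `zipfile` module under my own assumptions (comma-separated captions, hypothetical function name), not the toolset's actual export code.

```python
import io
import zipfile
from pathlib import PurePosixPath


def export_zip(entries: dict[str, tuple[bytes, list[str]]]) -> bytes:
    """Pack each image and its caption file into a Zip and return the raw bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, (image_bytes, tags) in entries.items():
            archive.writestr(name, image_bytes)
            # The caption shares the image's name, with a .txt extension.
            caption_name = str(PurePosixPath(name).with_suffix(".txt"))
            archive.writestr(caption_name, ", ".join(tags))
    return buffer.getvalue()


if __name__ == "__main__":
    data = export_zip({"a.png": (b"fake image bytes", ["1girl", "solo"])})
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(archive.namelist())
```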


So yeah, I'm just so proud of this project that I wanted to share my progress on it 💚