# Auto Tagger

Updated: Apr 20, 2025
Tags: tool, tagger, label, auto tagger
Type: Other
Published: Apr 20, 2025
Base Model: Other
Hash: AutoV2 68E3E7B249
Creator: custos

**Please install Python before use!!**     

This fully automatic labeling tool uses today's most powerful large language models (LLMs) to automatically generate labels for your training dataset. Testing shows that labels generated using Gemini 2.0 Flash achieve accuracy comparable to human manual tagging.

1. **Ensure you have Python installed.** This labeling tool comes in two versions: one for the Google Gemini API and one for the OpenAI API. With the Gemini API version, you can use any of Google's models. If you have a Google account, you can get a free API key from Google AI Studio; look up a tutorial online for the specifics. Google currently offers a free quota per account of up to 1500 Gemini 2.0 Flash calls per day even without paying, which is plenty for image labeling! (*Note: Using the Gemini API may require a VPN or similar tool in certain regions.*) If you have an OpenAI API key (or any third-party API compatible with OpenAI's), use the OpenAI API version instead.
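
For context, a single labeling call to the Gemini API boils down to sending the image plus an instruction prompt to a vision-capable model and saving the reply. The snippet below is only a minimal sketch using the `google-generativeai` Python package, not the tool's own code; the model name, prompt text, and API-key placeholder are assumptions.

```python
# Minimal sketch (not the tool's actual code): label one image via the Gemini API.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # key from Google AI Studio
model = genai.GenerativeModel("gemini-2.0-flash")

def tag_image(image_path: str,
              prompt: str = "Describe this image as short, comma-separated tags.") -> str:
    """Send the image and an instruction prompt, return the model's text reply."""
    img = Image.open(image_path)
    response = model.generate_content([prompt, img])
    return response.text.strip()

print(tag_image("example.jpg"))
```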

2. **After unzipping, click start_webui.bat.** On the first run, it will automatically install necessary dependencies. Please ensure you have at least 500MB of free space. Once dependencies are installed, the web UI will open automatically. (*Make sure your VPN/proxy is turned off when launching the program, otherwise the web UI might fail to start.*)

3. **This tool supports both automatic batch image labeling and manual single image labeling.**

To run **automatic batch image labeling**, enter your training set path (e.g., c:\my_training_set) into the 'Images Folder' field, then click 'Batch process images' to start. The generated labels are saved to a .txt file with the same name as the corresponding image, inside your training set folder. To interrupt the process, click the 'Stop' button next to it.

**Automatic batch image labeling** also supports multi-threading, processing up to 40 images per batch. If your API's RPM (requests per minute) limit is high enough, a training set of 1000 images can be labeled in just 25 batches, potentially taking only a few minutes. You can set the number of images per batch in the 'Max Concurrent Images (Batch Size)' field. ('Interval Between Batches (seconds)' is the time delay between batches; the default value of 5 is usually fine.) If you prefer to process images one by one sequentially, set 'Max Concurrent Images (Batch Size)' to 1 and 'Interval Between Batches (seconds)' to 0.
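
As a hedged illustration of how such a batch loop can be structured (this is not the tool's implementation), the sketch below labels up to `batch_size` images concurrently with a thread pool, writes each result to a same-named .txt file, and waits `interval` seconds between batches. `tag_image()` is the hypothetical single-image helper sketched in step 1.

```python
# Hypothetical batch loop; tag_image() is the single-image sketch from step 1.
import time
from pathlib import Path
from concurrent.futures import ThreadPoolExecutor

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}

def label_folder(folder: str, batch_size: int = 40, interval: float = 5.0) -> None:
    images = sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_EXTS)
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        # 'Max Concurrent Images (Batch Size)': label the whole batch in parallel.
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            labels = list(pool.map(lambda p: tag_image(str(p)), batch))
        # Save each label next to its image, same file name but a .txt extension.
        for image_path, text in zip(batch, labels):
            image_path.with_suffix(".txt").write_text(text, encoding="utf-8")
        time.sleep(interval)  # 'Interval Between Batches (seconds)'

label_folder(r"c:\my_training_set")
```

With a batch size of 40, a 1000-image set works out to the 25 batches mentioned above.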

4. **This tool also features an 'In-Context Learning' mode** (look for pages labeled 'incontext-learning'). This allows you to upload two reference images and their corresponding labels. The LLM learns the labeling style from these examples and applies it to your other images. If you want the LLM to label according to your specific requirements (e.g., outputting natural-language tags or Danbooru-style tags), using the 'In-Context Learning' page is highly recommended.
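
Conceptually, 'In-Context Learning' here is few-shot prompting: the two reference image/label pairs are placed in the request before the image you want labeled, so the model imitates their style. The sketch below is an assumption about how such a request could be assembled, reusing the `model` object from the step-1 sketch; it is not the tool's actual prompt, and the example tag strings are purely illustrative.

```python
# Hypothetical few-shot ("in-context learning") request built on the step-1 sketch.
from PIL import Image

def tag_with_examples(image_path: str, examples: list[tuple[str, str]],
                      style_hint: str = "") -> str:
    """examples is a list of (reference_image_path, reference_label_text) pairs."""
    contents = ["You label images. Copy the labeling style of the examples. " + style_hint]
    for ref_path, ref_labels in examples:
        contents += [Image.open(ref_path), "Labels: " + ref_labels]
    contents += [Image.open(image_path), "Labels:"]
    response = model.generate_content(contents)  # 'model' from the step-1 sketch
    return response.text.strip()

tags = tag_with_examples(
    "new_image.png",
    examples=[("ref1.png", "1girl, silver hair, outdoors"),
              ("ref2.png", "1boy, armor, night, cityscape")],
)
```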

5. **The tool also supports Dataset Pre-Processing.** Here, you can enter a target total pixel count between 400,000 and 1,000,000 (600,000 or higher is recommended). Click 'Run', and the tool will crop the images in your training set, maintaining their original aspect ratio, to match this pixel count. Smaller images effectively reduce labeling costs. (This function does not affect your original dataset in any way, so feel free to use it.)
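
As a rough sketch of the idea (the description above says 'crop'; this example instead uses a plain aspect-preserving downscale, which achieves the same goal of capping total pixels), the code below writes resized copies to a separate folder so the original dataset stays untouched. The folder names and the 600,000-pixel default are placeholder assumptions, not the tool's own behavior.

```python
# Hypothetical pre-processing pass: cap each image at a target total pixel count.
import math
from pathlib import Path
from PIL import Image

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}

def downscale_dataset(src: str, dst: str, target_pixels: int = 600_000) -> None:
    out = Path(dst)
    out.mkdir(parents=True, exist_ok=True)
    for path in Path(src).iterdir():
        if path.suffix.lower() not in IMAGE_EXTS:
            continue
        img = Image.open(path)
        w, h = img.size
        if w * h > target_pixels:
            # Scale both sides by the same factor so the aspect ratio is preserved.
            scale = math.sqrt(target_pixels / (w * h))
            img = img.resize((max(1, round(w * scale)), max(1, round(h * scale))),
                             Image.LANCZOS)
        img.save(out / path.name)  # originals are left untouched

downscale_dataset(r"c:\my_training_set", r"c:\my_training_set_small")
```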