Title: LORA Data Tool - Ultimate Dataset Builder (Dual-Engine AI)
LORA Data Tool is a powerful, portable, and all-in-one solution for preparing high-quality datasets for AI training. Designed specifically for LoRA creators, this tool automates image resizing and captioning in a single, streamlined workflow.
Key Features:
- Dual-Engine AI Captioning: Switch between the analytical precision of Microsoft's Florence-2 and the natural, fluid storytelling of Moondream2.
- Smart Model Management:
* Florence-2 is pre-integrated and ready to use out of the box.
* Moondream2 is automatically downloaded into its dedicated folder upon first use, ensuring a seamless setup.
- Three Detail Levels:
Generate "Short", "Medium", or "Long" captions tailored to your specific training needs.
- Intelligent Batch Processing:
Scale thousands of images to standard resolutions (512, 768, 1024) and generate matching .txt files simultaneously.
- Quick Analyzer Tab:
Instantly test and compare both AI models on a single image before committing to a full batch process.
- Integrated Data Auditor:
A dedicated interface to review, manually edit, or delete images and captions with lightning speed.
- Zero Installation:
Optimized for portable environments with automatic local management of dependencies.
Why choose LORA Data Tool?
Dataset quality is 90% of the battle in LoRA training. This tool allows you to blend Florence-2's cataloging capabilities with Moondream2's artistic eye, resulting in captions that help the model understand not just "what" is in the image, but "how" it is composed.
- Original Mac/Linux Core Scripts: SarcasticTofu
TROUBLESHOOTING & TIPS
- First Run Delay:
When you select Moondream2 for the first time, the application may appear "frozen" for a few minutes. This is normal! It is downloading approximately 3GB of model weights. Check your console/terminal to monitor the download progress.
- Broken Downloads: If Moondream2 fails to load, delete the 'moondream' folder inside the tool's folder and restart the app to trigger a fresh download.
===========================================================
Version 3.0: Moondream2 Phase 1
This version introduces Moondream2 for single-image analysis while maintaining the classic Florence-2 automation for folders. Perfect for those who want the new UI features with standard batch processing.
Key Updates:
Quick Analyzer (Tab 3): High-quality Moondream2 support for single images.
Anti-Freeze Tech: New system-level clipboard extraction (no more crashes with Windows Photos).
Dataset Mode: Auto-saves matching Image + TXT pairs, ready for LORA training.
Smart Navigation: "Open Folder" button to instantly access your files.
Universal Input: Seamlessly switch between Clipboard (Paste) and Drag & Drop.
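Dataset Mode depends on every image having a caption file with the same base name. As a rough sketch of how that pairing could be checked before training, the snippet below lists images without a .txt and .txt files without an image. The function name `find_unpaired` and the extension list are illustrative assumptions, not part of the tool.

```python
from pathlib import Path

# Assumed set of image extensions; the tool's actual list is not documented here.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def find_unpaired(folder):
    """Return (images_without_txt, txt_without_image) as sorted lists of file names."""
    folder = Path(folder)
    files = [p for p in folder.iterdir() if p.is_file()]
    # Index both kinds of file by base name (stem) so pairs can be matched.
    images = {p.stem: p.name for p in files if p.suffix.lower() in IMAGE_EXTS}
    captions = {p.stem: p.name for p in files if p.suffix.lower() == ".txt"}
    missing_caption = sorted(name for stem, name in images.items() if stem not in captions)
    missing_image = sorted(name for stem, name in captions.items() if stem not in images)
    return missing_caption, missing_image
```

Running it on the output folder before training gives a quick list of files that still need a caption or should be removed.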
Credits & Development
Windows Port, CPU Optimization & GUI Development: Jazara930
New Features (Full Moondream2 Integration, Multi-resolution Scaling, Quick Analyzer, Universal Input via Paste/Drag & Drop, New keys): Jazara930
Original Logic (Mac/Linux): SarcasticTofu
Storage Note
Note: The Moondream2 model (approx. 800MB) will be automatically downloaded to your Windows User profile cache (.cache/huggingface) on the first run. Ensure you have enough space on your C: drive.
Coming Soon: Full Moondream2 batch processing in the next update! Stay tuned.
LORA Data Builder 2.0 (v2.0 FINAL)
This is the definitive evolution of the Windows tagging tool. This software has been engineered as a professional Portable Suite for LoRA dataset preparation, designed to run entirely on CPU.
Why CPU-Only?
Many users face "WinError 126" or CUDA version conflicts with GPU-based tools. This version eliminates those issues, providing a stable, reliable, and fast experience on any modern Windows machine, regardless of your graphics card.
Key New Features in v2.0
The workflow has been completely modernized:
Paste from Clipboard: The ultimate time-saver. Simply copy an image from your browser and paste it directly into the tool for instant analysis.
Live Text Editing: The Auditor is now interactive. Modify the generated prompt and save it directly to the .txt file with one click.
5-Beam Precision Engine: We use a 5-beam search (num_beams=5) on the Florence-2 model to ensure the highest descriptive accuracy.
Popup-Free Workflow: All actions and "ANALYZING..." statuses are displayed in a sleek, non-blocking animated status bar.
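For readers curious what a 5-beam search looks like in code, here is a minimal sketch based on the public Hugging Face usage pattern for Florence-2. It is not the tool's actual source. The model ID `microsoft/Florence-2-base`, the mapping of Short/Medium/Long to Florence-2 task prompts, and the `max_new_tokens` value are assumptions made for illustration.

```python
# Assumed mapping from the tool's detail levels to Florence-2 task prompts.
TASK_BY_LEVEL = {
    "Short": "<CAPTION>",
    "Medium": "<DETAILED_CAPTION>",
    "Long": "<MORE_DETAILED_CAPTION>",
}


def generation_kwargs(num_beams=5, max_new_tokens=256):
    """Beam-search settings: fixed beam count, sampling disabled for repeatable output."""
    return {"num_beams": num_beams, "do_sample": False, "max_new_tokens": max_new_tokens}


def caption_image(image_path, level="Long", model_id="microsoft/Florence-2-base"):
    """Caption one image on CPU. Requires torch, transformers, and Pillow."""
    # Heavy imports are kept local so the helpers above work without them.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    task = TASK_BY_LEVEL[level]
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).eval()
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=task, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        **generation_kwargs(),
    )
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed = processor.post_process_generation(
        raw, task=task, image_size=(image.width, image.height)
    )
    return parsed[task].strip()
```

This sketch reloads the model on every call to keep it short. A batch job would load the model and processor once and reuse them for every image.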
Performance & Hardware
Optimized with Torch 2.9.1, the tool leverages modern CPU instruction sets:
High-End Desktop: Extremely fast batch processing.
Modern Laptops (Core Ultra / i9 / i7): High-speed inference (2-3 seconds for "Long" descriptions).
Standard Hardware: Solid and reliable performance on any modern PC.
Professional Dataset Management
Deterministic Generation: Consistent tagging style across your entire dataset.
Smart Scaling: Automated high-quality resizing (512, 768, 1024px) using Lanczos filtering.
Bulk Tools: Instant Search & Replace tags across thousands of files.
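A bulk Search & Replace over caption files can be pictured roughly as follows. This is a simplified sketch using only the standard library, not the tool's own code. The function name `replace_in_captions` is hypothetical, and it assumes captions are UTF-8 .txt files.

```python
from pathlib import Path


def replace_in_captions(folder, old, new):
    """Replace `old` with `new` in every .txt file under `folder`; return files changed."""
    if not old:
        raise ValueError("search text must not be empty")
    changed = 0
    for txt in sorted(Path(folder).rglob("*.txt")):
        text = txt.read_text(encoding="utf-8")
        if old in text:
            # Rewrite only the files that actually contain the search text.
            txt.write_text(text.replace(old, new), encoding="utf-8")
            changed += 1
    return changed
```

For example, `replace_in_captions("dataset/512_scaled", "red hair", "blue hair")` would rewrite every matching caption in that (hypothetical) folder and report how many files changed.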
Fully Portable & Offline
Zero Setup: Includes the full portable Python environment and the Florence-2 model pre-installed.
Total Privacy: Everything stays local on your machine.
Credits
Original Logic: SarcasticTofu
Windows Port, CPU Optimization & GUI Development: Jazara930
=========================================================
LORA DATA BUILDER - (v1.4 CPU/WIN) Portable
=========================================================
This v1.4 update is a major stability release. It refines the multi-resolution engine and introduces the Quick Analyzer, making it the most complete and stable portable tool for LoRA dataset preparation.
NEW IN VERSION 1.4:
* Quick Analyzer Tab: Instantly caption single images to test styles before bulk processing.
* Web Drag & Drop: Drag images directly from Civitai into the tool for instant analysis.
* Full English UI: The entire interface is now in English for global accessibility.
* Stability Fix: Improved GUI engine with "Groove" relief to prevent startup crashes.
* Smart Selection: Click the analyzer area to browse files or use Ctrl+C to copy generated prompts.
CORE FEATURES (v1.3 Legacy):
* Resolution Selector: Choose between 512px, 768px, or 1024px.
* Smart Scaling: Automatic shortest-side calculation preserving original aspect ratios.
* Dynamic Folders: Automatic organization into resolution-specific subfolders.
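The shortest-side calculation can be illustrated with a few lines of arithmetic. This is a sketch of the general technique, not the tool's code. The name `scaled_size` is hypothetical, and whether the tool also upscales images smaller than the target is not stated here, so this version always scales to the target.

```python
def scaled_size(width, height, target):
    """Scale so the shortest side equals `target`, keeping the aspect ratio."""
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("dimensions and target must be positive")
    scale = target / min(width, height)
    # Round to whole pixels and never return a zero-sized side.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(scaled_size(1920, 1080, 512))  # (910, 512)
print(scaled_size(800, 1200, 768))   # (768, 1152)
```

In both cases the shorter side lands exactly on the chosen resolution and the longer side follows proportionally.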
WHAT IS INCLUDED:
* Pre-configured **Python 3.11.9** environment.
* Pre-downloaded **Florence-2 Base** model (integrated for offline use).
* Startup & Repair Batch Files for immediate execution.
HOW TO START:
1. Extract the folder to your desktop (**DO NOT** run from inside ZIP).
2. Double-click on: "Avvio_Lora_Tool_v1.4_MultiRes.bat".
3. To Test: Go to "Quick Analyzer" and drag an image from Civitai or your PC.
4. To Batch: Select your folder, resolution, and click "Scale + Caption".
----------------------------------------------------------------------------------------------------
TECHNICAL NOTES:
* CPU Mode: Guaranteed stability on all Windows systems (no specialized GPU drivers required).
* Web Drop: Optimized for Civitai. If a specific website blocks dragging, simply save the image and drag it from your PC.
* Copying: Select the generated text in the Analyzer and press Ctrl+C.
CREDITS:
Porting, English UI - Multiresolutions & Quick Analyzer: Jazara930
Original Scripts: SarcasticTofu
=========================================================
Title: LoRA Data Tool v1.3 - Multi-Resolution Ed. (Universal Windows & CPU Portable)
Description:
This tool is an evolved, Windows-native, and Portable version of the original LoRA Data Builder. After the success of the first port, this v1.3 "Multi-Resolution Edition" introduces the most requested feature: the ability to choose your target scaling resolution.
Whether you are training a LoRA at 512px, 768px, or 1024px, this tool handles everything automatically, ensuring your dataset is perfectly prepared with high-quality captions and correctly scaled images.
Key Features in v1.3:
NEW: Multi-Resolution Selector: Choose between 512, 768, or 1024 from a simple dropdown menu.
Smart Scaling: The tool automatically calculates the correct aspect ratio based on your choice, scaling the shortest side and keeping the image proportions intact.
Dynamic Folders: Images are automatically saved in resolution-specific subfolders (e.g., /512_scaled/) to keep your workspace organized.
No GPU Required: Fully optimized for CPU (Florence-2 model), making it accessible to everyone, even on laptops or older systems.
Portable & Zero Setup: No Python environment to configure. Just unzip and run the executable.
Integrated Auditor: Review, edit, or delete captions and images in real-time through the built-in gallery.
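To make the Dynamic Folders idea concrete, here is a small sketch of writing a resized copy into a resolution-specific subfolder such as 512_scaled. It assumes Pillow is installed and that the subfolder sits next to the source image. The function names are hypothetical, and the Lanczos filter is chosen here as a common high-quality option rather than a confirmed detail of v1.3.

```python
from pathlib import Path


def output_path(image_path, target):
    """Return the destination inside a '<target>_scaled' subfolder next to the source."""
    image_path = Path(image_path)
    return image_path.parent / f"{target}_scaled" / image_path.name


def resize_to_subfolder(image_path, target):
    """Resize so the shortest side is `target` and save into the subfolder."""
    from PIL import Image  # Pillow, imported lazily so output_path works without it

    dest = output_path(image_path, target)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with Image.open(image_path) as img:
        scale = target / min(img.size)
        new_size = (max(1, round(img.width * scale)), max(1, round(img.height * scale)))
        img.resize(new_size, Image.Resampling.LANCZOS).save(dest)
    return dest
```

Keeping each resolution in its own subfolder means the raw images are never overwritten, so the same source folder can be processed again at a different size.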
How to use:
Select Folder: Choose the folder containing your raw images.
Set Resolution: Pick your target resolution (512, 768, or 1024).
Caption Style: Choose between Short, Medium, or Long (Civitai style) descriptions.
Process: Click "Scale + Caption" and let the tool do the magic.
Credits & Resources:
This project is a collaborative evolution of the community's effort:
Original Linux Scripts: Huge thanks to SarcasticTofu for the original concept and scripts. You can find his original Linux version HERE: https://civitai.com/models/2219046?modelVersionId=2498241.
Previous Version: If you are looking for the legacy 1024px-only version or the GPU (3090) version, it's available HERE: https://civitai.com/models/2266157?modelVersionId=2550818.
Porting & Optimization: Developed and refined by Jazara930.
