Published | Nov 19, 2024 |
Training | Steps: 375 Epochs: 5 |
Usage Tips | Clip Skip: 1 Strength: 1 |
Hash | AutoV2 D8D820CB1F |
American Dollar Bills is the first LoRA I trained for Flux. The goal was to generate realistic images of American banknotes in the main denominations ($1, $2, $5, $10, $20, $50, and $100), both front and back. It started as an experiment built on a good idea but ended with an imperfect result. Still, thanks to my passion for learning and my desire to improve my model-creation skills, and despite my initial inexperience, I understood what to do in the future to achieve ...
Dataset Creation
Image Search and Selection
I searched online for images of American banknotes (front and back) corresponding to the main denominations from $1 to $100. I collected a total of 150 images, including some in UHD resolution, ensuring a variety of designs and contexts.
Resizing
Using the Birme website, I quickly converted, resized, and cropped all the images into a square 1024x1024 format.
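The geometry of this step can be sketched in a few lines. This is only an illustration of a center crop to a square followed by a resize to 1024x1024; whether Birme was used in center-crop mode is an assumption, and the helper names `center_square_box` and `scale_to_target` are hypothetical.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def scale_to_target(side, target=1024):
    """Scale factor that maps the cropped square onto the target size."""
    return target / side


# Example: a 3840x2160 (UHD) source image.
box = center_square_box(3840, 2160)
print(box)  # (840, 0, 3000, 2160)
print(scale_to_target(box[3] - box[1]))
```

With an image library such as Pillow, the box could be passed to `Image.crop(box)` and the result to `Image.resize((1024, 1024))`. Note that a centered square crop of a wide banknote photo discards the left and right edges of the frame, which is worth checking when the bill fills the whole image.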
Study of Banknotes
Since I am not American and have never been to the United States, I spent time studying (though not in-depth) the design of U.S. banknotes, focusing on the details present on each denomination, both front and back.
Caption Creation
I used BLIP2 on Kohya to automatically generate captions for each image in the dataset. I then manually reviewed all captions using Google Translate for accuracy. Finally, with the help of ChatGPT, I optimized the captions, making them more precise and varied (for example, “a one-dollar banknote,” “a 1-dollar banknote,” etc.).
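The kind of variation described above can also be produced with a small template. This is a hedged sketch, not the process actually used (the captions were refined with ChatGPT); the function name `caption_variants`, the `side` wording, and the templates themselves are assumptions made for illustration.

```python
# Spelled-out names for the denominations covered by the dataset.
NUMBER_WORDS = {
    1: "one", 2: "two", 5: "five", 10: "ten",
    20: "twenty", 50: "fifty", 100: "one-hundred",
}


def caption_variants(value, side):
    """Return a few equivalent captions for one denomination and side."""
    word = NUMBER_WORDS[value]
    return [
        f"a {word}-dollar banknote, {side}",
        f"a {value}-dollar banknote, {side}",
        f"a ${value} bill, {side}",
    ]


for caption in caption_variants(1, "front"):
    print(caption)
```

Generating the variants per denomination also makes it easy to see, before training, which denominations have no captions at all.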
Training on Civitai
I decided to use the Civitai platform for training because it proved to be simpler and more efficient for me compared to Flux Gym.
However, after completing the training and testing the LoRA with the first generated images, I realized there was a noticeable imbalance in the dataset:
- Most of the images were of $100 banknotes.
- Other denominations, like the $10 bill, were completely missing.
This imbalance was clearly reflected in the generated results: the LoRA tends to generate $100 banknotes, even with generic prompts.
Dataset Analysis and Script Creation
To understand the dataset balancing errors, I created a simple Python script to analyze the captions. This tool searches for relevant keywords (such as “dollar,” “banknote,” or denominations like 1, 5, 10, 20, etc.) using a file called yes_word.txt. At the same time, it ignores unnecessary words (such as conjunctions and prepositions) using a file called no_word.txt.
The script generates a report that highlights:
- The 10 most frequent words.
- The 10 most frequently used numerical digits.
- A complete list of keywords found in the yes_word.txt file, sorted by frequency and alphabetically.
This allowed me to verify that the dataset was imbalanced, with an excess of $100 banknote images and the absence of some denominations, such as the $10 bill.
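The script itself is not reproduced here. The following is a minimal sketch of the kind of analysis described above, not the original tool. The function name `analyze_captions`, the sample captions, and the in-memory word lists (standing in for yes_word.txt and no_word.txt, whose exact format is not described) are assumptions. "Digits" is interpreted as whole numbers, and the list of missing keywords is an addition that is not part of the report described above.

```python
import re
from collections import Counter


def analyze_captions(captions, yes_words, no_words, top_n=10):
    """Count words and numbers in captions and summarize keyword usage."""
    words = Counter()
    numbers = Counter()
    for caption in captions:
        # Split into lowercase alphabetic words and whole numbers.
        for token in re.findall(r"[a-z]+|\d+", caption.lower()):
            if token.isdigit():
                numbers[token] += 1
            elif token not in no_words:
                words[token] += 1

    # A keyword may be a word ("dollar") or a number ("100").
    counts = {w: words[w] + numbers[w] for w in yes_words}
    found = [(w, n) for w, n in counts.items() if n > 0]
    found.sort(key=lambda item: (-item[1], item[0]))  # frequency, then A-Z

    return {
        "top_words": words.most_common(top_n),
        "top_numbers": numbers.most_common(top_n),
        "keywords": found,
        "missing": [w for w in yes_words if counts[w] == 0],
    }


# Made-up captions for illustration only.
captions = [
    "a 100-dollar banknote on a table",
    "a one-hundred-dollar banknote, front",
    "the back of a 100-dollar bill",
    "a 20-dollar banknote",
]
yes_words = ["dollar", "banknote", "bill", "1", "5", "10", "20", "50", "100"]
no_words = {"a", "the", "of", "on"}

report = analyze_captions(captions, yes_words, no_words)
print(report["keywords"])
print(report["missing"])
```

In this toy input, "10" shows up in the missing list, which is the same kind of gap the real report exposed for the $10 bill.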
Performance and Observations
In my tests, I noticed a significant improvement when using this LoRA with simple positive prompts compared to similar generations without it. For example:
- When applied as a texture, it covers objects more evenly, creating patterns that appear more refined and believable. By specifying certain features (such as the material type or texture), it blends perfectly into the object structure, making the texture look like part of the material itself.
- When used on characters, it creates more detailed and realistic clothing.
- Overall, this LoRA enhances image quality, especially in realistic contexts, producing more polished and visually appealing final results.
Final Considerations
American Dollar Bills is an experiment that has taught me a lot about the process of creating and training a LoRA. Despite its limitations and obvious balancing errors, it represents an important step in improving my future models.
Main Benefits:
- Highly credible and detailed results in realistic contexts, both for textures and for clothing on characters.
- Can be applied to objects and people.
- It’s only 18.3 MB in size.
What to Expect from This LoRA:
- It is designed to generate realistic images of American banknotes, but it is biased toward $100 bills because of the dataset imbalance.
- It can be used with weights between 1 and 2 in ComfyUI for optimal results.
- You can conduct further tests to see what it can do with a good prompt related to banknotes (rolled, in stacks, bundles, piles, etc.).
Areas for Improvement:
- For a better version in the future, I will need to balance the dataset before training, ensuring all banknote denominations are included more evenly.
- I will definitely need to integrate feedback from more experienced users to improve the training and analysis process.
- Developing better tools to analyze datasets before starting the training will be crucial.
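As a starting point for the first and last items, a pre-training check could compare the number of images per denomination against an even split. This is a hedged sketch under stated assumptions: the function name `balance_report`, the even-split target, and the example counts are all hypothetical and do not reflect the real dataset breakdown.

```python
from collections import Counter

DENOMINATIONS = (1, 2, 5, 10, 20, 50, 100)


def balance_report(labels, denominations=DENOMINATIONS):
    """Map each denomination to (image_count, images_to_add).

    labels holds one denomination per image. The target is an even
    split of the current dataset size; a negative second value means
    the denomination is over-represented by that many images.
    """
    counts = Counter(labels)
    target = len(labels) // len(denominations)
    return {d: (counts[d], target - counts[d]) for d in denominations}


# Made-up label list: 150 images, heavily skewed toward $100.
labels = [100] * 90 + [20] * 30 + [1] * 15 + [5] * 15

for denomination, (count, to_add) in balance_report(labels).items():
    print(f"${denomination}: {count} images, add {to_add}")
```

Running a check like this before uploading the dataset would flag both the surplus of $100 images and any denomination with zero images.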
If you’ve ever wanted to generate detailed images of American banknotes or incorporate realistic textures into your projects, give American Dollar Bills a try! Your feedback will be vital in improving future developments.