Flux-American-Dollar-Bills

Verified: SafeTensor
Type: LoRA
Stats: 53, 0
Published: Nov 19, 2024
Base Model: Flux.1 D
Training: Steps: 375, Epochs: 5
Usage Tips: Clip Skip: 1, Strength: 1
Hash (AutoV2): D8D820CB1F
Created on Civitai
The FLUX.1 [dev] Model is licensed by Black Forest Labs. Inc. under the FLUX.1 [dev] Non-Commercial License. Copyright Black Forest Labs. Inc.
IN NO EVENT SHALL BLACK FOREST LABS, INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL.

American Dollar Bills is the first LoRA I trained for Flux. The goal was to generate realistic images of American banknotes in the main denominations ($1, $2, $5, $10, $20, $50, and $100), both front and back. It started as an experiment built on a good idea and ended with an imperfect result. Even so, driven by a desire to learn and to improve my model-creation skills despite my initial inexperience, I understood what to do in the future to achieve ...

Dataset Creation

Image Search and Selection

I searched online for images of American banknotes (front and back) in the main denominations from $1 to $100. I collected a total of 150 images, including some in UHD resolution, aiming for a variety of designs and contexts.

Resizing

Using the Birme website, I quickly converted, resized, and cropped all the images into a square 1024x1024 format.
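As a rough illustration of what a square crop and resize involves, here is a minimal Python sketch of the arithmetic. It assumes a centered crop, which may differ from the focal point chosen in Birme, and the function names are hypothetical rather than part of any tool I used.

```python
def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)          # the square is limited by the shorter edge
    left = (width - side) // 2         # equal margin trimmed on each side
    top = (height - side) // 2
    return left, top, left + side, top + side


def scale_to_target(width: int, height: int, target: int = 1024) -> float:
    """Return the factor that maps the cropped square to target x target."""
    return target / min(width, height)


# Example: a UHD (3840x2160) source image.
box = center_square_box(3840, 2160)
print(box)                              # crop box in source pixel coordinates
print(round(scale_to_target(3840, 2160), 3))
```

For a wide image such as a banknote photographed edge to edge, a centered square crop discards the left and right ends, so part of the note can be lost before training.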

Study of Banknotes

Since I am not American and have never been to the United States, I spent time studying (though not in-depth) the design of U.S. banknotes, focusing on the details present on each denomination, both front and back.

Caption Creation

I used BLIP2 on Kohya to automatically generate captions for each image in the dataset. I then manually reviewed every caption, using Google Translate to check its accuracy. Finally, with the help of ChatGPT, I optimized the captions, making them more precise and varied (for example, “a one-dollar banknote,” “a 1-dollar banknote,” etc.).

Training on Civitai

I decided to use the Civitai platform for training because it proved simpler and more efficient for me than Flux Gym.

However, after completing the training and testing the LoRA with the first generated images, I realized there was a noticeable imbalance in the dataset:

- Most of the images were of $100 banknotes.

- Other denominations, like the $10 bill, were completely missing.

This imbalance was clearly reflected in the generated results: the LoRA tends to generate $100 banknotes, even with generic prompts.

Dataset Analysis and Script Creation

To understand the dataset balancing errors, I created a simple Python script to analyze the captions. This tool searches for relevant keywords (such as “dollar,” “banknote,” or denominations like 1, 5, 10, 20, etc.) using a file called yes_word.txt. At the same time, it ignores unnecessary words (such as conjunctions and prepositions) using a file called no_word.txt.

The script generates a report that highlights:

- The 10 most frequent words.

- The 10 most frequently used numerical digits.

- A complete list of keywords found in the yes_word.txt file, sorted by frequency and alphabetically.

This allowed me to verify that the dataset was imbalanced, with an excess of $100 banknote images and the absence of some denominations, such as the $10 bill.
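The kind of analysis described above can be sketched roughly as follows. This is not the original script, only a minimal reconstruction under stated assumptions: captions are stored as one .txt file per image in a single folder, yes_word.txt and no_word.txt hold one entry per line, whole numbers are counted rather than individual digits, and all function names are hypothetical. The "missing" list is an addition of this sketch, meant to surface keywords that never occur.

```python
import re
from collections import Counter
from pathlib import Path


def load_word_list(path):
    """Read one lowercase entry per non-empty line (yes_word.txt / no_word.txt)."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def read_captions(folder):
    """Assumption: one caption per .txt file, stored next to the images."""
    return [p.read_text(encoding="utf-8") for p in sorted(Path(folder).glob("*.txt"))]


def analyze_captions(captions, yes_words, no_words, top_n=10):
    """Count words and numbers across captions and summarize the keywords."""
    word_counts = Counter()
    number_counts = Counter()
    for caption in captions:
        # Split into alphabetic words and whole numbers; "$100" yields "100".
        for token in re.findall(r"[a-z]+|\d+", caption.lower()):
            if token.isdigit():
                number_counts[token] += 1
            elif token not in no_words:
                word_counts[token] += 1

    # A keyword may be a word ("dollar") or a number ("10").
    keyword_counts = {w: word_counts[w] + number_counts[w] for w in yes_words}
    found = sorted(
        ((w, c) for w, c in keyword_counts.items() if c > 0),
        key=lambda item: (-item[1], item[0]),  # by frequency, ties alphabetical
    )
    missing = sorted(w for w, c in keyword_counts.items() if c == 0)
    return {
        "top_words": word_counts.most_common(top_n),
        "top_numbers": number_counts.most_common(top_n),
        "keywords": found,
        "missing": missing,
    }


if __name__ == "__main__":
    sample = [
        "a 100-dollar banknote on a table",
        "the back of a 100 dollar bill",
        "a one-dollar banknote",
    ]
    report = analyze_captions(
        sample,
        yes_words={"dollar", "banknote", "100", "10"},
        no_words={"a", "the", "of", "on"},
    )
    for section, rows in report.items():
        print(section, rows)
```

With real data, the two word sets would come from load_word_list("yes_word.txt") and load_word_list("no_word.txt"), and the captions from read_captions() pointed at the dataset folder.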

Performance and Observations

In my tests, I noticed a significant improvement when using this LoRA with simple positive prompts compared to similar generations without it. For example:

- When applied as a texture, it covers objects more evenly, creating patterns that appear more refined and believable. By specifying certain features (such as the material type or texture), it blends perfectly into the object structure, making the texture look like part of the material itself.

- When used on characters, it creates more detailed and realistic clothing.

- Overall, this LoRA enhances image quality, especially in realistic contexts, producing more polished and visually appealing final results.

Final Considerations

American Dollar Bills is an experiment that has taught me a lot about the process of creating and training a LoRA. Despite its limitations and obvious balancing errors, it represents an important step in improving my future models.

Main Benefits:

- Highly credible and detailed results in realistic contexts, both for textures and for clothing on characters.

- Can be applied to objects and people.

- It’s only 18.3 MB in size.

What to Expect from This LoRA:

- It is designed to generate realistic images of American banknotes, with a particular focus on $100 bills (due to the dataset imbalance).

- It can be used with weights between 1 and 2 in ComfyUI for optimal results.

- You can conduct further tests to see what it can do with a good prompt related to banknotes (rolled, in stacks, bundles, piles, etc.).

Areas for Improvement:

- For a better version in the future, I will need to balance the dataset before training, ensuring all banknote denominations are included more evenly.

- I will definitely need to integrate feedback from more experienced users to improve the training and analysis process.

- Developing better tools to analyze datasets before starting the training will be crucial.
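
A pre-training balance check along these lines could look like the sketch below. It is only an illustration: the caption-to-denomination matching is a rough heuristic (it takes the first number it recognizes, so a caption like "two one-dollar bills" would be misread), the 30% threshold is an arbitrary assumption, and all names are hypothetical.

```python
import re
from collections import Counter

DENOMINATIONS = (1, 2, 5, 10, 20, 50, 100)
WORD_TO_VALUE = {"one": 1, "two": 2, "five": 5, "ten": 10, "twenty": 20, "fifty": 50}


def caption_denomination(caption):
    """Guess the denomination a caption refers to, or return None."""
    tokens = re.findall(r"[a-z]+|\d+", caption.lower())
    # Prefer explicit numbers such as "100" or "$20".
    for token in tokens:
        if token.isdigit() and int(token) in DENOMINATIONS:
            return int(token)
    # "one hundred" and "hundred" both mean the $100 bill.
    if "hundred" in tokens:
        return 100
    for token in tokens:
        if token in WORD_TO_VALUE:
            return WORD_TO_VALUE[token]
    return None


def balance_report(captions, max_share=0.3):
    """Count captions per denomination and flag gaps and excesses."""
    counts = Counter()
    for caption in captions:
        value = caption_denomination(caption)
        if value is not None:
            counts[value] += 1
    total = sum(counts.values())
    return {
        "counts": {d: counts[d] for d in DENOMINATIONS},
        "missing": [d for d in DENOMINATIONS if counts[d] == 0],
        "overrepresented": [
            d for d in DENOMINATIONS if total and counts[d] / total > max_share
        ],
    }


if __name__ == "__main__":
    sample = ["a 100-dollar banknote"] * 4 + [
        "a one-dollar banknote",
        "a five dollar bill",
        "a 20 dollar bill",
        "front of a one hundred dollar note",
    ]
    print(balance_report(sample))
```

Running a check like this before training would have flagged both problems described earlier: denominations with no images at all and a single denomination dominating the set.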

If you’ve ever wanted to generate detailed images of American banknotes or incorporate realistic textures into your projects, give American Dollar Bills a try! Your feedback will be vital in improving future developments.