
Tomboys for FLUX

Verified: SafeTensor
Type: LoRA
Published: Sep 2, 2024
Base Model: Flux.1 D
Training Steps: 9,938
Training Epochs: 10
Usage Tips: Strength: 1
Trigger Words: tomboy
Hash (AutoV2): E0757680BA
eurotaku
The FLUX.1 [dev] Model is licensed by Black Forest Labs. Inc. under the FLUX.1 [dev] Non-Commercial License. Copyright Black Forest Labs. Inc.
IN NO EVENT SHALL BLACK FOREST LABS, INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL.

Update3: Oh boy, is it working well now! And all it took were a few more training steps (10k to be precise, so basically 4 times the amount :D). Makes me wonder whether the same would have been true for the earlier attempts, too.

Update2: Wow, the JoyCaptioned attempt is a real surprise: it is definitely the worst one at reproducing the tomboy likeness from the dataset, and on top of that it seems to be the horniest attempt yet (the dataset is SFW). Maybe the T5 text encoder really is that slow at learning new stuff, so more steps should help with that, right? Well, let's see!

Update: Hm, it seems I wasn't totally wrong when suspecting the duplication artifacts might have to do with the training resolution: the 1024x1024 run shows way fewer of them (though they are occasionally still present) while retaining the more complex backgrounds and more variable compositions. Let's see what Flux does with some natural-language captions next.

Okay, first findings after 3 trainings: a completely uncaptioned dataset works surprisingly well with Flux. So well, in fact, that introducing just one caption (the trigger "tomboy") not only failed to improve the results but even made them slightly worse. Then again, the tiny bit of cherry-picking I had to do for the 2nd showcase might have been just bad seed luck.

The 3rd one, with classic booru-style tags, is a whole other experience, though. On the one hand, especially for the sunbathing pictures, it produced way better environments with more consistent beaches, pools, etc. On the other hand, it introduced heavy body salad. OK, not SD3 levels, but I included some less-than-successful images in the showcase as a reference. It also seems to be way more prone to including more than one person, maybe tied to the 512x512 training resolution? Will retry with 1024x1024 next to compare.

The word "tomboy" is a compound word which combines "tom" with "boy". Though this word is now used to refer to "boy-like girls", the etymology suggests the meaning of tomboy has changed drastically over time. In 1533, according to the Oxford Dictionary of English, "tomboy" was used to mean a "rude, boisterous or forward boy". By the 1570s, however, "tomboy" had taken on the meaning of a "bold or immodest woman". Finally, in the late 1590s and early 1600s, the term morphed into its current meaning: "a girl who behaves like a spirited or boisterous boy; a wild romping girl."

(from Wikipedia)

There you have it: Tomboys, challenging gender stereotypes since 1600. And looking damn good at the same time. All the more outrageous that Flux seems to have no idea what a tomboy is. But that changes now with this LoRA (hopefully).
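
For anyone who wants to try it outside a UI, here is a rough sketch of how the page's settings (trigger word "tomboy", strength 1, base model Flux.1 D) could be applied with the Hugging Face diffusers library. This is not the setup used for the showcase images. The LoRA file path, the adapter name, the step count, the guidance scale and the seed handling are all assumptions chosen for illustration, and the generate function needs a GPU plus access to the FLUX.1 [dev] weights.

```python
TRIGGER_WORD = "tomboy"   # trigger word listed on this page
LORA_STRENGTH = 1.0       # "Strength: 1" from the usage tips


def build_prompt(description: str, trigger: str = TRIGGER_WORD) -> str:
    """Prepend the trigger word unless the description already contains it."""
    description = description.strip()
    if not description:
        return trigger
    if trigger.lower() in description.lower():
        return description
    return f"{trigger}, {description}"


def generate(description: str, lora_path: str, seed: int = 0):
    """Render one image with the LoRA applied (heavy: downloads FLUX.1 [dev])."""
    # Imported here so build_prompt stays usable without torch/diffusers installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    # lora_path is a hypothetical local path to the downloaded .safetensors file.
    pipe.load_lora_weights(lora_path, adapter_name="tomboy")
    pipe.set_adapters(["tomboy"], adapter_weights=[LORA_STRENGTH])
    pipe.enable_model_cpu_offload()  # trades speed for lower VRAM use

    result = pipe(
        build_prompt(description),
        height=1024,
        width=1024,
        num_inference_steps=28,   # assumed value, not from this page
        guidance_scale=3.5,       # assumed value, not from this page
        generator=torch.Generator("cpu").manual_seed(seed),
    )
    return result.images[0]
```

Something like `generate("sunbathing at the beach", "path/to/tomboy_lora.safetensors").save("out.png")` would then write one image; swapping the seed is the cheap way to check whether a bad result is just seed luck.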