<p>I wanted to do more Flux training experiments, and I got some ⚡⚡⚡ buzz donated ⚡⚡⚡ from a user to run some character experiments, so run the experiments I did!</p><p>This training still focuses on different caption types, to learn and spread more knowledge about training LoRAs with Flux, specifically character training.</p><p><a target="_blank" rel="ugc" href="https://civitai.com/models/677814">Shadowheart Flux Character LoRA</a></p><p><img src="https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/81e7cf2b-c55e-493a-a9ad-667f0373d644/width=525/81e7cf2b-c55e-493a-a9ad-667f0373d644.jpeg" />I've uploaded each trained variant as a separate version of the model. In this article, similar to my previous <a target="_blank" rel="ugc" href="https://civitai.com/articles/6792/flux-captioning-differences-training-diary">Flux Training captioning diary</a>, I will talk about the different settings and my observations.</p><hr /><h1 id="the-dataset-qy2fvs727">The Dataset</h1><p>Since this is a very popular character, it already has several models on CivitAI. I put together a dataset using generations from those models as the base, plus a few screenshots from the game and a few pieces of fan art found online. 30 images in total were used.</p><ul><li><p>11 images were in anime style</p></li><li><p>10 images were in a semi-realistic (2.8D) style</p></li><li><p>9 images were in 3D / in-game graphics style</p></li></ul><hr /><h1 id="joycaption-notrigger-s2i3rf3ez"><a target="_blank" rel="ugc" href="https://civitai.com/models/677814?modelVersionId=758756">JoyCaption-NoTrigger</a></h1><pre><code>Steps: 1050
Resolution: 512
Batch Size: 2
Unet LR: 0.0005
Network Dim: 2
Network Alpha: 16
Optimizer: AdamW8Bit</code></pre><p>This version used the recommended settings from the <a rel="ugc" href="https://education.civitai.com/using-civitai-the-on-site-lora-trainer">CivitAI Flux Training Documentation</a>.</p><p>This was trained on complex captions with very long descriptions, without using a trigger word to activate the character.</p><h3 id="example-caption-jxxv6ipam">Example caption</h3><pre><code>This is a highly detailed digital illustration in fantasy art style. The subject female elf with pointed ears, fair skin, and slender, athletic build. She has long, dark hair styled high ponytail single, thick braid. Her eyes are striking green, she wears an elaborate, ornate headpiece large, green gem the center. attire form-fitting, sleeveless outfit deep neckline that reveals significant amount of cleavage, accentuating her medium-sized breasts. top made dark, glossy material gold accents form V-shape on chest. also tight, black pants highlight curvaceous hips legs.

The background serene, twilight sky gradient transitioning from blue at to soft orange near horizon, suggesting either sunrise or sunset. There faint, distant mountains few floating stars, adding mystical atmosphere. lighting even, giving image smooth, polished appearance. overall mood one mystique, focus character's confident regal presence.</code></pre><h3 id="model-results-dx57rvy17">Model Results</h3><p>This version is versatile, but you need to use a couple of keywords to trigger the character's look. You have to describe an elf-like appearance, pointy ears, or a fantasy character. Perhaps add a crown and armor, and you'll get the entire character.</p><p>The character is also very customizable. Gender can be swapped, different poses can be achieved as well as completely different clothes and appearances.</p><p>However, the appearance must come early in the prompt, before the Shadowheart character explanation.</p><p><img src="https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/714c329e-0253-43ad-9ac1-96550830e823/width=525/714c329e-0253-43ad-9ac1-96550830e823.jpeg" /></p><hr /><h1 id="wd14-notrigger-jx5idtp6q"><a target="_blank" rel="ugc" href="https://civitai.com/models/677814">WD14-NoTrigger</a></h1><pre><code>Steps: 1050
Resolution: 512
Batch Size: 2
Unet LR: 0.0005
Network Dim: 2
Network Alpha: 16
Optimizer: AdamW8Bit</code></pre><p>This was trained on WD14-style tag captions without using a trigger word.</p><h3 id="example-caption-cowes65pq">Example caption</h3><pre><code>1girl, solo, long hair, breasts, blush, looking at viewer, bangs, blue eyes, brown hair, large breasts, black hair, photoshop \(medium\), shirt, gloves, dress, long sleeves, original, medium breasts, jewelry, green eyes, closed mouth, standing, ponytail, braid, upper body, short sleeves, cowboy shot, sidelocks, earrings, sky, parted lips,</code></pre><h3 id="model-results-aopzmjkq4">Model Results</h3><p>This model did not capture the character so well if you prompt it with long and complex captions. This makes a lot of sense, since the captions are in comma-separated tag-based form.</p><p>If you instead prompt it with simpler tag-based words however, the model finds the character just fine.</p><p>I noticed that even if you don't prompt for it, the model adds a bit more anime style to the outputs. This is likely because it matches and merges the training captions with existing similar captions, which I could guess is how some of the core Flux model's anime training is done. It seems like a reasonable captioning strategy.</p><p>Example of simple generation prompt that captures enough of the character:</p><pre><code>1girl, bangs, black hair, ornament, jewelry, standing, armor, green eyes, nose scar, makeup</code></pre><p><img src="https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/47e5e7fc-dcf2-4622-94c8-3d2d139dd2af/width=525/47e5e7fc-dcf2-4622-94c8-3d2d139dd2af.jpeg" /></p><hr /><h1 id="nocaption-trigger-rto9757tr"><a target="_blank" rel="ugc" href="https://civitai.com/models/677814?modelVersionId=758740">NoCaption-Trigger</a></h1><pre><code>Steps: 1050
Resolution: 512
Batch Size: 2
Unet LR: 0.0005
Network Dim: 2
Network Alpha: 16
Optimizer: AdamW8Bit</code></pre><p>This was trained with no captions other than a trigger word to activate the character. I'm using the trigger word <code>sh4d0wh34rt</code> to make sure the model does not have any existing knowledge of the Shadowheart character, and also to make sure the words "shadow" and "heart" don't bleed in.</p><h3 id="example-caption-2fds9r2b2">Example caption</h3><pre><code>sh4d0wh34rt</code></pre><h3 id="model-results-x3tqrkhve">Model Results</h3><p>This works really well. I can use only the prompt <code>A realistic photo of a Sh4d0wh34rt woman</code> and it creates a reasonable version of the character. Adding descriptive words that match her outfit or design then brings out even more of the trained data.</p><p>The model is flexible when it comes to costumes and actions, while still maintaining the character design. Gender-swapping is weaker than with the JoyCaption models, though.</p><p><img src="https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/5409c987-d91e-42ad-953e-28049f887ece/width=525/5409c987-d91e-42ad-953e-28049f887ece.jpeg" /></p><hr /><h1 id="joycaption-trigger-8b2bvjrgv"><a target="_blank" rel="ugc" href="https://civitai.com/models/677814?modelVersionId=758786">Joycaption-Trigger</a></h1><pre><code>Steps: 1050
Resolution: 1024
Batch Size: 2
Unet LR: 0.001
Network Dim: 4
Network Alpha: 32
Optimizer: AdamW8Bit</code></pre><p>This version was trained on complex captions with very long descriptions, as well as using the trigger word <code>sh4d0wh34rt</code> at the start of the caption.</p><h3 id="example-caption-m5nuix6h8">Example caption</h3><pre><code>A sh4d0wh34rt female character. This image is a highly detailed digital illustration depicting fantasy elf-like character with pointed ears, fair skin, and long, dark hair. The subject young woman serene expression, her eyes closed lips slightly parted. She has delicate, feminine face hint of freckles on nose cheeks. Her skin smooth flushed, adding to the ethereal, otherworldly feel artwork.

She wearing headpiece that resembles crown intricate, metallic patterns, necklace adorned beads pendants add touch attire. background textured, blend deep blues purples, creating mystical atmosphere. lighting soft moody, casting shadows highlight contours textures hair clothing.

Her right hand gently touching cheek, fingers spread, subtle reflection light nails. overall style realistic focus fine details textures, making appear lifelike immersive. artist's signature visible side image, personal piece.</code></pre><h3 id="model-results-i5cvk2ivc">Model Results</h3><p>This model surprised me at first. It did not produce good outputs at all. I do not think it has anything to do with the captioning, but rather the fact that I increased the Learning Rate to 0.001 from the default of 0.0005. This is 2x the original LR, so it makes sense that it only picks up on the broad strokes and not the details.</p><p>The costume and key character designs are there. Prompting with <code>A fantasy artwork of a Sh4d0wh34rt woman.</code> gives you an armored female character with pointy ears, black or gray hair, green eyes, sometimes a diadem and sometimes the right nose and mouth for the character. So it has started learning, but not at a level where it manages to capture the details.</p><p>With additional prompting you can get more out of the model. The model could be used to generate imperfect versions of the character (like realistic people cosplaying as the character). But overall, the model is not useful compared to the other versions.</p><p><img src="https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/3c968cf4-d977-48bc-83bd-a8cfb3a72efd/width=525/3c968cf4-d977-48bc-83bd-a8cfb3a72efd.jpeg" /></p><hr /><h1 id="wd14-trigger-2ox2gmumd"><a rel="ugc" href="https://civitai.com/models/677814?modelVersionId=758795">WD14-Trigger</a></h1><pre><code>Steps: 1050
Resolution: 1024
Batch Size: 2
Unet LR: 0.00025
Network Dim: 4
Network Alpha: 32
Optimizer: AdamW8Bit</code></pre><p>This version was trained on WD14-style comma-separated tag captions, with the trigger word <code>sh4d0wh34rt</code> at the start of the caption. Worth noting is that I experimented with the Learning Rate of the model here. I used 0.00025 instead of the default 0.0005. This is 0.5x the original LR.</p><h3 id="example-caption-ylkvczats">Example caption</h3><pre><code>a sh4d0wh34rt female character, 1girl, solo, long hair, breasts, blush, looking at viewer, smile, bangs, black hair, hair ornament, photoshop \(medium\), shirt, long sleeves, original, jewelry, green eyes, closed mouth, ponytail, braid, upper body, hairband, sidelocks, earrings, parted lips, outdoors, day, pointy ears, blunt bangs, cape, water, armor, twin braids</code></pre><h3 id="model-results-s4sigukbd">Model Results</h3><p>The halved learning rate really shows in the model. You can see that it captures the finer details, like facial features, the scar and freckles on her face. But the armor, diadem and bigger picture details are not there.</p><p>This could be somewhat useful if you want to modify the big picture of the character but keep the details. However, similar to the <code>Joycaption-Trigger</code> version of the model above, I think it's just a less useful model than the first three models trained at LR 0.0005.</p><p>Learning Rate Note</p><p>I did also change the resolution, network dimension and alpha on the 2 "failed" models, so those factors could also be giving us trouble here. I do however think it's all about the LR in this case.</p><p><img src="https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/066e2428-36c9-4df7-b11e-d54712bc5236/width=525/066e2428-36c9-4df7-b11e-d54712bc5236.jpeg" /></p><hr /><h1 id="conclusion-5oj0l7uu1">Conclusion</h1><p>Learning Rate matters! 
A lot!</p><p><img src="https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/d3177cfa-4a6b-4867-a09b-f14aa10a4383/width=525/d3177cfa-4a6b-4867-a09b-f14aa10a4383.jpeg" />The relative learning rates, compared to the "default" of 0.0005.</p><p>Using the CivitAI onsite trainer, you'll get some good defaults. Use them.</p><hr /><h1 id="recommended-training-meqmby48e">Recommended training</h1><p>I have two favorites from these trained versions.</p><h3 id="flexibility-l9htas0ne">Flexibility</h3><p>The <a target="_blank" rel="ugc" href="https://civitai.com/models/677814?modelVersionId=758740"><strong>NoCaption-Trigger</strong></a> version wins this category. This model gives you the character right away, and it's the easiest to transform into something else, such as other costumes and poses.</p><p>The drawback is that you need to prompt for her original outfit to get it.</p><p>Best Captured Likeness:</p><p>The <a target="_blank" rel="ugc" href="https://civitai.com/models/677814?modelVersionId=758756"><strong>JoyCaption-NoTrigger</strong></a> version was the best at reproducing the character in full.</p><p><strong>Worth noting is that I believe that using JoyCaption with a trigger would result in an even stronger model, when trained at an appropriate Learning Rate.</strong></p><h1 id="did-you-say-you-got-some-sponsored-buzz-egi49ako2"><span style="color:#fd7e14">Did you say you got some sponsored buzz?</span></h1><p>Yes! This is how we get cool articles like this. More ⚡⚡⚡ = more cool training like this!</p><p>I don't mind if you drop a handful of buzzes with the button right below here! Press it and watch the ⚡ go up!</p>
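<p>As a small appendix, the learning-rate multipliers quoted for the two experimental runs (2x and 0.5x) can be double-checked with a few lines of Python. This is only a sketch: the dictionary and the <code>relative_lr</code> helper are names made up for illustration, and the values are simply copied from the settings blocks in this article.</p>

```python
# Unet learning rates copied from the settings blocks in this article.
# DEFAULT_LR is the value used by the three runs that kept the default settings.
DEFAULT_LR = 0.0005

UNET_LR = {
    "JoyCaption-NoTrigger": 0.0005,
    "WD14-NoTrigger": 0.0005,
    "NoCaption-Trigger": 0.0005,
    "Joycaption-Trigger": 0.001,
    "WD14-Trigger": 0.00025,
}


def relative_lr(lr, default=DEFAULT_LR):
    """Return lr expressed as a multiple of the default learning rate."""
    return lr / default


# Print each run's learning rate relative to the default.
for name, lr in UNET_LR.items():
    print(f"{name}: {relative_lr(lr):g}x the default LR")
```

<p>Keep in mind that the two runs with a changed learning rate also changed resolution, network dimension and alpha, so the multiplier alone does not isolate the cause.</p>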


Flux Character Caption Differences - Training Diary
