
Baking Outfit LoRAs with Very Few Images: My Adventures in Creating VG Armor Models


INTRODUCTION

Welcome to my first article! These are my notes on what I've learned from creating Final Fantasy XI armor LoRAs using only a few crappy screenshots and maybe a piece of fanart or two.

I'm still learning, so if this interests you, check back in for updates every once in a while, and please provide corrections if you see any mistakes.

Much of my learning started with this article, so give it a read for a fuller understanding of making LoRAs on Civitai: https://civitai.com/articles/9005/a-detailed-beginners-guide-to-lora-training-on-civitais-trainer. Great job, HidFig!

Just to clarify: if you have a decent number of screenshots, creating synthetic data is not necessary. Just make sure to tag the screenshots as '3d', and the checkpoint will work out on its own how to render non-3d styles from them. Thank you to veteran baker NanashiAnon for mentioning this.

CREATING TRAINING IMAGES

I'm trying to create the Dragoon's first AF armor from Final Fantasy XI, an older game with dated graphics. Here's an example of what I'm working with (and this is AFTER upscaling!):

latest.png

Okay, so we don't have many training images, and the ones we have aren't very good. Not to worry -- we can use what we have to create more.

Nano Banana Pro (NBP) is your friend. Of all the free, easy-to-use image-gen services I've discovered, it is the most useful for this. Make sure to get your prompt right, though: the free version currently only allows two Pro gens per day, and regular Nano Banana isn't nearly as good. I submitted the clearest screenshot of the armor I could get with the following prompt:

I have attached a screenshot of the male Dragoon's AF armor from Final Fantasy XI. The screenshot is grainy and pixelated. Please render the outfit in more detail, matching the outfit as closely as possible, concept art style, rich colors, plain white background. Please show the armor from multiple views/angles, landscape orientation, high resolution as possible.

The result looks good:

Gemini_Generated_Image_xdups6xdups6xdup.png

Once we've run out of NBP uses, we can also use Sora to work with the images NBP gave us. Try to make the results as diverse as possible for healthy training data. Some examples of things to do:

  • Render images without the helmet (during baking, make sure to tag them as 'no headgear')

  • Render male and female body types (if the M/F outfits are different in some way, make sure to have a tag that differentiates them; see below)

  • Render different poses

  • For pictures with multiple views, crop each view out into its own image

  • Upscale any small images. If you don't have your own upscaling workflow, you can just use waifu2x

  • White backgrounds are better than busy backgrounds. Put in the effort to remove as much visual interference as possible from the images. Crop aggressively.

  • And of course if there is good fanart of the outfit, then use that too, but only if it's accurate -- we're going for fidelity here.
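The "crop each view out into its own image" step can be sketched as a small helper that computes one crop box per view. This is only a rough sketch, and it assumes the views sit side by side in equal-width columns, which a generated sheet won't always do. The function name `view_boxes` and the sheet size are made up for illustration.

```python
def view_boxes(width, height, n_views, margin=0):
    """Return one (left, upper, right, lower) box per side-by-side view.

    Assumes n_views equal-width columns across the sheet (an assumption;
    real multi-view renders may need the boxes adjusted by eye).
    """
    if n_views < 1:
        raise ValueError("n_views must be at least 1")
    boxes = []
    for i in range(n_views):
        # Integer column edges, pulled inward by the optional margin.
        left = i * width // n_views + margin
        right = (i + 1) * width // n_views - margin
        boxes.append((left, margin, right, height - margin))
    return boxes


# Hypothetical 1536x1024 sheet showing three views of the armor.
for box in view_boxes(1536, 1024, 3):
    print(box)
```

The boxes use the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so if you have Pillow installed, each one can be passed straight to `crop` and the result saved as its own training image.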

An example of my Sora prompt, using the armor art that Nano Banana gave me:

Attached is a picture of armor. Please render a man with a clean-cropped brown beard-mustache like Riker from Star Trek in the armor, striking different random poses. He is not wearing the helmet. White background.

The result. Lookin' good, Riker!

20260130_1001_Image Generation_remix_01kg7xjebrfj4r2t3z4dyk9pjs.png

Using these methods, I went from a few crappy screenshots to ~25 decent images. This isn't a high number by any means, but it's enough to train on.

PRE-BAKING

Jagginess and pixelation must be fixed before baking. You might ask: isn't that a job for the Noise Reduction parameter during baking? Messing with noise reduction during baking can help, but it has undesired effects on overall sharpness and clarity. It is always better to feed the LoRA good data. Asking Sora to smooth over the jagginess works; Nano Banana didn't do as good a job.

Tagging is important for making the outfit more 'modular', i.e. letting users add, remove, or change components of the outfit. Tag everything. Use the auto-tagger for the obvious stuff, but then go through each image and tag it thoroughly. Set the allowable tags to the maximum (30). No idea why it defaults to only 10.

  • As an example, the AI sometimes interpreted the winged helmet as horns; use the tag viewer to look for any spurious tags like 'horns' and delete them.

  • I didn't bother with male/female tags, as the only real difference between the two versions is the thigh cutout, so I tagged that instead. If an existing tag can differentiate the outfit types, don't make up your own. This way, for example, users can give a male character the thigh cutout if they want, or a female character the full pants.
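If you keep captions outside the trainer's tag viewer, the cleanup described above can be sketched in a few lines. This assumes each caption is a comma-separated tag string; the helper name `clean_tags` and the example tags are illustrative only, not something the Civitai trainer provides.

```python
def clean_tags(caption, remove=(), add=()):
    """Drop unwanted tags, append missing ones, and de-duplicate.

    Tag order is preserved, so a trigger word placed first stays first.
    """
    unwanted = {t.strip().lower() for t in remove}
    seen = set()
    result = []
    candidates = [t.strip() for t in caption.split(",")]
    candidates += [t.strip() for t in add]
    for tag in candidates:
        key = tag.lower()
        # Skip empty entries, blacklisted tags, and repeats.
        if not tag or key in unwanted or key in seen:
            continue
        seen.add(key)
        result.append(tag)
    return ", ".join(result)


# Hypothetical auto-tagger output where the winged helmet was read as horns.
caption = "1boy, armor, horns, winged helmet, white background, armor"
print(clean_tags(caption, remove=["horns"], add=["thigh cutout"]))
# 1boy, armor, winged helmet, white background, thigh cutout
```

Keeping the original order matters if the first tag is meant to stay put while the rest are shuffled, which is what a Keep tokens value of 1 is generally used for.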

BAKING

Normally I use 12 epochs and maybe 4-6 repeats, but with so few training images this needs adjusting. We'll try 10 repeats this time; effectively, the robot needs to focus more on the few images we do have. If I had fewer than 20 training images, I would bump the repeats to maybe 12-15.

Copilot recommended I try to get the steps above 1,000. Steps work out to roughly images × repeats × epochs ÷ batch size, so with 10 repeats we should bring up the epoch count as well. 18 epochs and 10 repeats results in 1,260 steps. Good enough.
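To check the step count yourself, here is a small sketch of the usual arithmetic. It assumes each epoch's step count is rounded up to a whole number of batches, which may not match exactly how the trainer counts. The image count of 28 is an inference from the 1,260 figure at batch size 4, not a number stated in the article.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size):
    """Estimate total training steps for a given dataset and settings."""
    # Each epoch sees every image `repeats` times, grouped into batches.
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


print(total_steps(28, 10, 18, 4))  # 1260
print(total_steps(25, 10, 18, 4))  # 1134
print(total_steps(25, 5, 12, 4))   # 384
```

The last line shows why the usual 12 epochs with around 5 repeats falls well short of 1,000 steps on a dataset this small.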

For DIM and Alpha:

  • DIM 16 → safer, more general

  • DIM 24 → more detail, slightly more risk of overfitting

  • DIM 8 → might underfit with low‑res data

So we'll go with 16. This will make the file size a little higher than I usually have it, but that's one of the downsides of minimal training data!

SUMMARY

All of this now gives us the following (anything I don't mention, leave as-is):

  • Epochs -- 18

  • Repeats -- 10

  • Batch size -- 4

  • Resolution -- 1024

  • Enable Bucket -- Sure, why not

  • Shuffle tags -- Sure, why not

  • Flip Augmentation -- Sure, why not

  • Keep tokens -- 1

  • Clip Skip -- 2

  • Network DIM -- 16

  • Network Alpha -- 16

  • Noise Offset -- 0.05

  • Optimizer -- Prodigy
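For anyone who keeps their settings in a script or notes file, the same list can be written down as a plain dictionary. The key names here are made up for illustration and are not the trainer's actual field names.

```python
# Settings from the summary above; key names are hypothetical.
settings = {
    "epochs": 18,
    "repeats": 10,
    "batch_size": 4,
    "resolution": 1024,
    "enable_bucket": True,
    "shuffle_tags": True,
    "flip_augmentation": True,
    "keep_tokens": 1,
    "clip_skip": 2,
    "network_dim": 16,
    "network_alpha": 16,
    "noise_offset": 0.05,
    "optimizer": "Prodigy",
}

# LoRA weights are commonly scaled by alpha / dim, so equal values give 1.0.
lora_scale = settings["network_alpha"] / settings["network_dim"]
print(lora_scale)  # 1.0
```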

Anyway, I think that's the meat & potatoes of what you need. Check back as I'll be garnishing the article with more as I learn. Please look forward to it! Thanks for reading.
