
How to train Pony/Illustrious lora with multiple costumes

I've been asked several times how to make a lora with multiple costumes, so I decided to go ahead and share my secret kung fu techniques, which were never meant to be secret in the first place.

TIME CONSUMPTION

To save you some time, I'll start with how long this sort of thing takes, so you can decide if it's worth it.

Preparing data: 45 mins

About 30 mins to gather the images and about 15 to tag them manually on the system.

Training: depends on the system. CIVITAI takes about TWO hours for a multi-costume lora and about an hour or less for simple characters. Google Colab is a bit faster, at about an hour and a half (but the results are not as good).

In any case, training time doesn't matter because you don't need to be present. Once you push "train" you can go wherever you want.

WARNINGS

Let me warn you about some things, though.

First, I make lora through a sort of "mechanical process"; I lack any sort of technical knowledge. If you ask me "why 2 and not 3" about settings, or "wouldn't it be better to activate X so it..." honestly... I have no idea. If you have any questions about the why of things, or whether changing something would improve the results, I simply will not be able to answer.

There is probably a lot I could learn and improve, but that's not something I aim for, for a simple reason: these lora work for the purposes I want them for, so I don't give it much thought beyond that.

Second, I train most lora on the civitai system BUT you can use whatever system you want. There are other sites out there that allow training, there are also Google Colab notebooks and such. But regardless of the method, you will need the dataset.

Now, for PREPARING the DATA we will use the CIVITAI system, because it's FREE, it does not consume your BUZZ as long as you don't actually train, and it's very visual and easy to use.

Third, this is focused on lora with multiple costumes, but I'll try to mix in other types of lora as well. Also, this guide is meant for PONY. The process for standard XL would be the same, but the number of pictures required and the settings would vary a lot, since in my experience standard XL needs a LOT more references to give similar results.

ILLUSTRIOUS is pretty much the same as PONY with a few minor differences, so this tutorial is valid for that model as well for the most part.

NUMBER OF PICTURES RECOMMENDED

The first step is gathering the pictures, and here are my recommended numbers.

TIP: try not to add NSFW images. The more dressed your girls are the better, because the AI doesn't really need to learn how to undress your characters; what it needs to learn is their costumes. So unless you are desperate for data, try to use SFW only. Don't worry, it will work for NSFW anyway if that's what you are searching for; I have a whole article about that.

The EXCEPTION is styles: for styles, especially NSFW artists, you want as many nudes as possible, but for characters it's better that you don't.

One character, one costume.

About 40-60 is the ideal. Having 200 pictures in your folder won't make it better; pick the best 40-50, mixing as many angles as possible.

One character, few costumes (2-3)

40 pictures of the "main" costume and 20 pictures of each additional costume, for a total of 80ish.

One character, one Costume (low data)

15 is the very least I've tried with what I considered satisfying results. Less than 15... nope. I mean, it may work, but in that case I would rather make a 1.5 lora, make some gens, pick the best ones and fix them as well as I can, so I can fill out the data until I have at least 20 images.

(The Eiyuden loras were made with this process starting from ONE single image, but doing all that is quite tedious.)

One character multiple costumes (4+)

20 pictures for each costume. So 80-160 total.

TIP: This is not a chemistry experiment. You can gather 15 of one costume and 22 of another; the numbers are only a guideline.

STYLES

90-120 have given me the best results.

PRELUDE

First of all, even if you are only using the system to prepare the data, not actually training, remember to still abide by the TOS. Do not use images that violate the terms of service, and use only images that you own the rights to or were given permission to use by the legit rights owners, and all that stuff. One must respect DA RULES.

Once you have gathered the images, there are two things you should do for convenience.

ONE:

Name all the images that show the character in the same costume the same way, so they stay grouped together.
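If you don't feel like renaming files by hand, here is a rough Python sketch of the idea. It assumes you have one folder per costume; the function names, the `martha_swimsuit` prefix and the numbering scheme are made up for illustration and are not part of any tool mentioned here.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}

def plan_renames(filenames, costume):
    """Return {old_name: new_name} so every file shares the same costume prefix."""
    plan = {}
    for index, name in enumerate(sorted(filenames), start=1):
        suffix = Path(name).suffix.lower()
        plan[name] = f"{costume}_{index:03d}{suffix}"
    return plan

def apply_renames(folder, costume):
    """Rename the image files in `folder` following plan_renames."""
    folder = Path(folder)
    images = [p.name for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES]
    for old, new in plan_renames(images, costume).items():
        if old == new:
            continue
        if (folder / new).exists():
            # Refuse to overwrite a file that already has the target name.
            raise FileExistsError(f"{new} already exists in {folder}")
        (folder / old).rename(folder / new)

# Preview only; nothing on disk is touched by plan_renames.
print(plan_renames(["b.PNG", "a.jpg"], "martha_swimsuit"))
```

Run `apply_renames` once per costume folder, and the files will sort together by costume when you drag them in.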

TWO:

Open a notepad and write down the tags for the character's main traits and for each one of the different costumes.

I usually go from head to feet so I don't forget anything.

Try to use different tags for different yet similar elements. For example, instead of "bikini" I used "white bikini" for one of the costumes, while for the other I used "black bikini", for the mecha costume I used "hair ornament" but for the vanilla costume I used "hair ribbon".

Of course this is not an exact science. It won't always give you exactly what you want, but it helps keep the different costumes self-contained while not gravitating towards complete outfits.

Also, you don't need to tag every single piece of clothing. The idea here is that each costume has its own associated tags.

WHY BOTHER: Because when you manually put the tags in the exact same order in every image for the training, later, when you make the gens and prompt those tags, the lora instantly understands which costume you are aiming for.

BAD TIP: Instead of this, you can simply give one command word to each costume, like "ascension1" or "swimsuit1". This will put the lora in more of an autopilot mode, and you won't need to tag so much. The downside is that the lora will always pull you towards one of the costumes as a whole, making it less flexible and making it harder to combine elements of different costumes or to combine it with other character lora.

DATA PREPARATION

So, now that this is done, we go to CIVITAI, to "Create > Train a lora".

Remember, YOU DON'T NEED TO USE CIVITAI to train the lora, and you don't need buzz either, since this feature is FREE and it's for preparing the data only.

Pick the model type

Drag all the images. Since you named them properly (right?) All the images with the same costume will appear together, so just copy the right lines from the notepad and paste them in the corresponding costume and press the "+" button to add the tags.

It's not the end of the world if you skip the following step, but the better your dataset, the better the results, so make sure no image is tagged with a piece that is missing from it.

For example, say the first Martha image doesn't wear the sarong: simply remove the "sarong" tag on that particular image. Then again, it's not an exact science, but if you do this properly, it helps the lora learn how to remove/add pieces better.
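The same clean-up can also be done on a caption string outside the site. This is a minimal sketch, assuming captions are plain comma-separated tag lists; `remove_tags` is a made-up helper name, not a civitai feature.

```python
def remove_tags(caption, missing):
    """Drop the tags listed in `missing` from a comma-separated caption, keeping order."""
    missing = {tag.strip().lower() for tag in missing}
    kept = []
    for tag in caption.split(","):
        tag = tag.strip()
        if tag and tag.lower() not in missing:
            kept.append(tag)
    return ", ".join(kept)

# The image without the sarong keeps every other costume tag.
print(remove_tags("1girl, long hair, white bikini, sarong", ["sarong"]))
```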

Seems an arduous job, but it's just copy-paste, copy-paste a few times and delete some tags here and there. This one took like 15 minutes or so. It takes more time to save the images than tagging them manually.

Once you have done this, you want to download the data, because if something goes wrong in the next step you will have lost 15 minutes of your life.

That covers the costumes. Now we are going to run the "auto tag" but first go back to the notepad and simply put all the tags you used on a single line without repeating them.

Then copy that line and paste it into "blacklist", make sure "append" is selected, and set max tags to 25 and threshold to 0.4 (feel free to play, but I always use these).

THIS IS IMPORTANT: do not simply copy all the lines and paste them into the "blacklist", because if several tags are repeated several times it may cause the autotag to fail. So use the notepad as I indicated. It's no drama if you repeat a tag or two, but don't put "1girl, long hair, purple hair, blue eyes" five times in the "blacklist" box, simply because the system tends to fail if you do that.

UPDATE: This seems to have been fixed, so now you can just copy/paste all the lines into "blacklist" and the autotag won't fail.
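If you still prefer a clean line without repeated tags, a few lines of Python can build it from the notepad. This is just a sketch: `blacklist_line` is a hypothetical helper, and it assumes each notepad line is a comma-separated list of tags.

```python
def blacklist_line(costume_lines):
    """Merge several tag lines into one line, keeping the first occurrence of each tag."""
    seen = set()
    unique = []
    for line in costume_lines:
        for tag in line.split(","):
            tag = tag.strip()
            key = tag.lower()
            if tag and key not in seen:
                seen.add(key)
                unique.append(tag)
    return ", ".join(unique)

notepad = [
    "1girl, long hair, purple hair, blue eyes, white bikini, sarong",
    "1girl, long hair, purple hair, blue eyes, black bikini",
]
print(blacklist_line(notepad))
```

The shared character tags appear once, followed by the tags that are specific to each costume.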

Once you are done, click "submit" and wait.

STYLES

Just to clarify, everything I mentioned is for characters with multiple costumes, and even for characters with only one costume, but NOT for STYLES lora. For those, just dump all your images from that artist in there and run the autotag directly. You don't need to tag every image manually; that would be hell.

Once it's finished, DOWNLOAD the data.

This is the finished dataset, which you can use to train on CIVITAI or anywhere else you want.

Now with the data on hand, you can go to your preferred training place.

ACTUAL TRAINING

SETTINGS

First, I will remind you again that I lack technical knowledge. I do not know WHY these settings work or whether other options are better. All I know is that these work.

Regardless of the system used, I only ever touch "repeats" and "train batch size". I never touch anything else because I would not know what I'm touching.

As a general rule: 2 repeats, 1 batch size. With the numbers of pictures I recommended above, the steps should always be somewhere between 750 and 2000. (Maybe a bit more if it's 6 or 7 outfits; Martha with 9 outfits came out at 3000 steps.)

Then again, I'm not knowledgeable in the technical aspects, but with any more steps than that it already starts looking too burned, at least for my tastes.
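To get a rough idea of where the step count will land before you train, here is a small sketch using the usual kohya-style estimate (images times repeats, divided by batch size, times epochs). The number of epochs is not something this guide specifies, so the `epochs=10` default below is purely an assumption; check the value your trainer actually shows.

```python
import math

def estimate_steps(num_images, repeats, batch_size=1, epochs=10):
    """Rough total step estimate. epochs=10 is an assumed default, not a value from this guide."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

# 80 images (one main costume plus two extra ones) at 2 repeats, batch size 1.
print(estimate_steps(80, 2))
```

If the result falls far outside the 750-2000 range, the repeats are the knob to adjust.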

One character, one costume

2 repeats PONY 3 repeats ILLUSTRIOUS

1 batch size

One character, few costumes (2-3)

2 repeats PONY 3 repeats ILLUSTRIOUS

1 batch size

One character, one Costume (low data)

This is the only one that requires explanation. I try to keep the steps above 750, so with 20 pictures I usually go with

3 repeats PONY 4 repeats ILLUSTRIOUS

1 batch size

But if it's very low like 15-17 pictures I would go with

4 repeats PONY 5 repeats ILLUSTRIOUS

1 batch size

But I only did that once; usually it's 3 repeats and 1 batch size.

One character multiple costumes (4+)

2 repeats PONY 2 repeats ILLUSTRIOUS

1 batch size

STYLES

3 repeats PONY ??? repeats ILLUSTRIOUS

1 batch size

IMPORTANT ABOUT ILLUSTRIOUS

In my testing so far, Illustrious works pretty much the same when it comes to training, but I got the best results by adding an extra repeat when the amount of data is rather low. I also got decent results in some cases using the exact same repeats as with PONY.

STYLES don't seem to catch well on ILLUSTRIOUS, so I'd rather not give advice on that because I'm still trying to get the hang of it myself.
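For quick reference, the repeats listed in this section can be written down as a small lookup table. This is only a sketch: the case names such as `low_data` are labels invented here, and the missing ILLUSTRIOUS value for styles is kept as unknown rather than guessed.

```python
# Repeats per case and base model, copied from the lists above (batch size is always 1).
REPEATS = {
    "one_costume":   {"pony": 2, "illustrious": 3},
    "few_costumes":  {"pony": 2, "illustrious": 3},     # 2-3 costumes
    "low_data":      {"pony": 3, "illustrious": 4},     # around 20 pictures
    "very_low_data": {"pony": 4, "illustrious": 5},     # 15-17 pictures
    "many_costumes": {"pony": 2, "illustrious": 2},     # 4+ costumes
    "styles":        {"pony": 3, "illustrious": None},  # no recommendation given
}

def recommended_repeats(case, model):
    """Look up the repeats for a case; raise if the guide gives no value."""
    value = REPEATS[case][model.lower()]
    if value is None:
        raise ValueError(f"no recommended repeats for {case} on {model}")
    return value

print(recommended_repeats("low_data", "Illustrious"))
```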

________________________________________________

Since I use civitai most of the time, I'll show you how to set it up on civitai specifically, but remember you can use the data on any other service or method.

Then again, if you are going to use CIVITAI for the actual training, abide by the TOS.

Make sure to pick Pony as the model. It seems dumb, but I still have some really mediocre standard XL models lying around because I clicked buttons too fast.

For the prompt examples I usually just paste three of the lines from the notepad we made earlier or leave them empty. It doesn't matter; they're just for seeing how the training is going.

TIP: DO NOT use custom models, using base pony works the best.

And for the settings themselves, as I said, I do not touch anything but these.

If I use another system, for example one of the many Google Colab projects, I have absolutely no idea what each thing does, so then again I only touch these settings.

You may well be able to get better results by touching the right settings, but then again, do not ask me, I don't know.

training_model: Pony Diffusion V6 XL

num_repeats: 2

train_batch_size: 1

After that, it's just a matter of pushing the "train" button.

GENERATING IMAGES

Once the model is finished, it's ready for use. In my case, I generate the images locally, so I download the file and use Automatic1111 to generate the images.

For the generations I use as base the tags in our notepad, followed by whatever context I want for the character, followed by the quality tags.

For example:

1girl, long hair, purple hair, blue eyes, age regression, beret, shirt, apron, mittens, bandana around neck, spacecraft interior, bed, plant, white walls, holding heart-shaped chocolate, heart-shaped pupils, evil smile, flat chest, low-tied long hair <lora:Martha_XL:1>, score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, BREAK source_anime, masterpiece
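The prompt layout above (notepad tags, then context, then the lora, then quality tags) can be assembled with a tiny helper if you generate a lot of variations. This is only a sketch; `build_prompt` is a made-up name, and the quality tags are simply the ones from the example above.

```python
QUALITY_TAGS = (
    "score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, "
    "BREAK source_anime, masterpiece"
)

def build_prompt(character_tags, context_tags, lora_name, weight=1, quality_tags=QUALITY_TAGS):
    """Join notepad tags, context tags, the <lora:name:weight> tag and quality tags."""
    parts = list(character_tags) + list(context_tags)
    parts.append(f"<lora:{lora_name}:{weight}>")
    parts.append(quality_tags)
    return ", ".join(parts)

print(build_prompt(
    ["1girl", "long hair", "purple hair", "blue eyes", "beret", "apron"],
    ["spacecraft interior", "bed", "evil smile"],
    "Martha_XL",
))
```

Swapping the first list for another costume's notepad line is all it takes to switch outfits.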

CLOSING

Aaaand, that's it. I think I added everything I wanted to say. The only thing I'm not convinced about is the format and layout of the article, since I'm not used to making tutorials, but I tried my best. I hope it helps and that everything is clear enough.

I can do it, so pretty much anyone can. The most time-consuming part is picking and acquiring the images; the actual preparation takes like 10-15 minutes.

If you have any questions about methods or ideas and think I can help I'm always open to share.

BUT do not ask me for technical stuff, like reducing the file sizes, optimizing settings to improve lora performance, different values of different settings, because as much as I would like to help you, I HAVE NO CLUE.
