Note: I always research new ways to train, there's no set way for me, it always changes.
NNote: I'm assuming you have kohya_ss installed.
NNNote: 90% of the time I train styles, so these settings work best for me for training style LoRAs.
Training a LoRA is pretty easy if you have enough VRAM. 12GB is perfect; I've heard you can do it with 8GB, but I reckon that would be very slow. There are only a few steps to it.
Data Gathering
Gathering data is probably the easiest step, or the hardest, depending on what kind of data you want to gather. Personally, I use imgbrd-grabber; it's pretty easy to use, so go check it out.
What I do is try to get images that aren't too complicated, unless of course it's something you specifically want. For simplicity, I try to avoid images that look butchered, are low quality, have too much text, or are the personification of Internet Explorer.
Then comes the next step.
Data Cleaning
For me, cleaning is even easier than gathering images. I used to use Photoshop and just manually paint over each image, removing text, watermarks, or anything else I didn't want the model to learn. I hated using online services because I was too lazy to upload, then download, and do it all 40 times.
But then I came across IOPaint (formerly Lama Cleaner). It's way faster than using Photoshop, cleans the images perfectly with the LaMa model, and it even has Segment Anything (though I don't really use that).
After cleaning the images, I put them in a nice folder inside the Kohya Image directory, where you put your images to be learned. The next step would be:
Data Organizing
Some of you might like your images to be named randomly, but I don't. I name them in order: 1, 2, 3, and so on. Of course, not manually - I use a Python script to name all the images, and along with that, it renames the folder to give it an instance name. I'm not good at naming, so why not do it automatically too?
In the same script, I have it process the folder and calculate the number of repeats and steps based on the number of images and the batch size.
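I won't pretend this is my exact script, but a minimal sketch of the renaming part looks something like this (the folder name, extensions, and the "20" repeats prefix are just examples):

```python
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def rename_images(folder: Path) -> int:
    images = sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_EXTS)
    # two passes, so renaming onto an existing "1.png" can't collide
    temp = [p.rename(p.with_name(f"tmp_{i}{p.suffix.lower()}")) for i, p in enumerate(images)]
    for i, p in enumerate(temp, start=1):
        p.rename(p.with_name(f"{i}{p.suffix}"))
    return len(images)

folder = Path("img/tudduls")
print(f"Images in folder: {rename_images(folder)}")
# kohya expects the folder itself to be named "<repeats>_<instance name>"
folder.rename(folder.with_name(f"20_{folder.name}"))
```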
Before, I used to divide 200 by the number of images, then round up. For example, if there were 23 images:
200 / 23 ≈ 8.69, which rounds up to 10
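In Python terms, that old calculation was roughly this (the rounding to the closest five is explained a bit further down):

```python
import math

images = 23
repeats = 200 / images                 # 8.69...
repeats = math.ceil(repeats / 5) * 5   # round up to the closest five -> 10
```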
I was happy with that; training usually took me an hour for a ~130MB LoRA. What happened? Well, I'll tell you in a few steps.
Because the next step is one of the most important ones:
Data Tagging
It's highly important to tag your images correctly. I use the WD Tagger in the Kohya GUI, but I always look over the tags to make sure they're correct; sometimes there's a tag or two that shouldn't be there. Consider using BooruDatasetTagManager: it's pretty easy to use, lightweight, and you can even autotag with it. I just use it to review my tags instead of Notepad like I used to.
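If you'd rather script the cleanup, here's a rough sketch for pruning unwanted tags from the .txt caption files the tagger writes next to each image (the path and the tags in BAD_TAGS are just examples):

```python
from pathlib import Path

BAD_TAGS = {"watermark", "signature", "english text"}  # example tags I wouldn't want

for txt in Path("/path/to/image_folder").glob("*.txt"):
    tags = [t.strip() for t in txt.read_text(encoding="utf-8").split(",")]
    kept = [t for t in tags if t not in BAD_TAGS]
    txt.write_text(", ".join(kept), encoding="utf-8")
```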
Settings
Now here comes the step everyone wants; it's probably why you're here. I said before that I used to get my repeats by dividing 200 by the number of images and rounding up to the closest ten (or just five). My epochs were always set to 10, with Network Dim 32, Network Alpha 32, and a batch size of 2.
But then I started reading how others did it and how different it was. I noticed that I could get away with a lower Network Dim and Alpha, so I set them to 16 and 8.
But I figured, why not go even lower? With more testing, I settled on 8 Dim and 4 Alpha.
Then I came across the different types of LoRAs. I knew about them before but didn't really want to get into them, but I decided, why not? I tried different ones: DoRA, GLoRA, LyCORIS, IA3 (this one was my nightmare), and I settled on LyCORIS, specifically LoCon. I still kept 8/4 even for the convolution layers.
For the optimizer? I didn't want to give myself a headache, so I just use Prodigy; it's the simplest one for me (though it gave me a major headache back when I used OneTrainer).
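To put those settings in one place, here's a rough sketch of the equivalent sd-scripts invocation (kohya's GUI calls train_network.py under the hood). The paths are placeholders and plenty of required flags are omitted, so don't treat this as my actual config; that's what the .json at the end is for:

```python
import subprocess

cmd = [
    "accelerate", "launch", "train_network.py",
    "--pretrained_model_name_or_path", "/path/to/base_model.safetensors",  # placeholder
    "--train_data_dir", "/path/to/img",                                    # placeholder
    "--network_module", "lycoris.kohya",   # LyCORIS instead of plain LoRA
    "--network_args", "algo=locon", "conv_dim=8", "conv_alpha=4",
    "--network_dim", "8",
    "--network_alpha", "4",
    "--optimizer_type", "Prodigy",
    "--train_batch_size", "2",
    "--max_train_epochs", "7",
]
subprocess.run(cmd, check=True)
```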
I mentioned how I have this script that makes the impossible possible. Well, thanks to ZyloO, I noticed he has something similar, so I figured, why don't I "borrow" the code so I can get the exact number of repeats, epochs (which is pretty much always 7), and steps.
It goes something like this:
Processed folder: tudduls -> 20_tudd tudduls
Images in folder: 47, Successfully renamed: 47
Repeats: 20, Epochs: 7, Total steps: 3290
Batch size: 2
--------------------------------------------------
Processing complete. This window will close in 30 seconds.
Pretty neat, isn't it? But you can do that yourself; it's pretty easy (see the sketch below). You want around 3000 steps; before, I used to settle for just 1000~1500 steps, which usually worked for me, but I noticed people talking about how 3000 is the sweet spot, and you know, where the sheep herd goes, the other sheep must follow.
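I won't pretend this is the exact formula in the script, but something like this reproduces the numbers above (assuming a ~3000-step target and repeats rounded up to a multiple of five):

```python
import math

def plan(num_images, batch_size=2, epochs=7, target_steps=3000, round_to=5):
    # pick repeats so num_images * repeats * epochs / batch_size lands near target_steps
    raw = target_steps * batch_size / (num_images * epochs)
    repeats = math.ceil(raw / round_to) * round_to
    total_steps = num_images * repeats * epochs // batch_size
    return repeats, total_steps

print(plan(47))  # -> (20, 3290), matching the output above
```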
Of course, I'll upload my .json file, but note that I'm an amateur who got lucky with a GPU and is just waiting for AI to take over the world.
If you have any questions, feedback, or suggestions, please do leave a comment, and don't forget to like and subscribe for more epic Minecraft videos!
Consider joining my Discord server, where you can send your requests, find out more about upcoming LoRAs, and find resources on training your own.