
Everything I Know About LoRA


This is an article on everything I have learned about Stable Diffusion, with a main focus on collecting, training, and developing LoRAs. Sections are labeled as clearly as possible to make navigation easier. If you are confident in your knowledge, feel free to skip the beginning and go straight to the "meat and potatoes" below.

Stable Diffusion and basic terms
Stable Diffusion, as we all know, is the code/program we use to generate images from our own devices, no internet connection or payments required. The need-to-know terms for usage are checkpoint (model), VAE, and LoRA.
What's a checkpoint?
The checkpoint does the bulk of the image processing and has a major effect on the images you generate. Base Stable Diffusion comes in a few versions, the most popular being 1.5; if a checkpoint doesn't list what original model it was based on, it's probably 1.5. The original 1.5 is a realistic model made to do all kinds of things and not optimized for any of them, so it's common to use a custom checkpoint you prefer rather than raw 1.5 itself. If you're an anime person (like myself), the rules apply slightly differently. Anime models use NAI (NovelAI) as a base, an originally paid service that has since become quite accessible for any setup. You don't tend to use NAI itself except for LoRA training (more on that later); popular anime models for actual usage would be things like the Anything and AbyssOrange series of checkpoints.

What's a VAE?
VAEs are super important, and if this term does not mean anything to you, PLEASE READ!!! A VAE is the second half of any checkpoint. Without a VAE, your images will look washed out and bleached of color. The VAE is the color training of a checkpoint, separated out into its own file. Most checkpoints have a recommended VAE, but VAEs are interchangeable, so you can use your favorite one with a variety of different checkpoints. Just be very careful to have one active so your images don't look watered down.

note
The VAE can be activated from Settings; sometimes a restart is necessary for it to take effect. If you're using AUTOMATIC1111, you can also add the VAE selector to the "Quicksettings list" in Settings so it appears at the top of the UI.
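If you script against Stable Diffusion in Python rather than using the webui, the same pairing of checkpoint and VAE looks roughly like the sketch below. This is a minimal sketch assuming the Hugging Face diffusers library; the model names are just common public examples, not recommendations from this guide.

```python
# Minimal sketch, assuming the Hugging Face diffusers library is installed.
from diffusers import StableDiffusionPipeline, AutoencoderKL

# Load a standalone VAE and hand it to the pipeline, the same way the
# webui pairs your checkpoint with whichever VAE you have active.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    vae=vae,
)

image = pipe("a watercolor landscape").images[0]
image.save("with_vae.png")
```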
What's a LoRA?
A LoRA is a Low-Rank Adaptation file; it adds new concepts and terms to your AI that weren't originally built in. This is where a person's customization can really be seen. LoRAs are applied at a percentage to your checkpoint, adding their individual training to the checkpoint in use. In the AUTOMATIC1111 prompt this looks like <lora:filename:0.6>, where 0.6 means 60% strength. When making images you need to be mindful of the various percentages of each LoRA; you can't use a ton of LoRAs at ":1" (= 100%). Personally, I try to make sure my LoRAs work at around 0.6 so they can mix with other LoRAs.
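For the Python-minded, here is a hedged sketch of the same percentage idea with diffusers; "myLora.safetensors" and "trigger_word" are placeholders for your own file and its trigger.

```python
# Sketch only: applying a LoRA at partial strength with diffusers.
# "myLora.safetensors" and "trigger_word" are placeholders.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_lora_weights(".", weight_name="myLora.safetensors")

# scale=0.6 is the same idea as <lora:myLora:0.6> in the webui prompt:
# 60% strength leaves room to mix with other LoRAs.
image = pipe(
    "1girl, trigger_word",
    cross_attention_kwargs={"scale": 0.6},
).images[0]
```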
What are the other terms, like Hypernetwork and Textual Inversion?
Hypernetworks and Textual Inversions are similar to LoRAs in that they are collections of separate training data you can insert into your checkpoint. Hypernetworks follow the percentage system as well, but a Textual Inversion is bound to a single word. A Textual Inversion is normally a theme and doesn't require anything further after you type its word into the prompt window.
What's a LoCon/LyCoris?
A LoCon is a "LoRA beyond conventional methods" file. They take longer to train, but they are preferred for style-type LoRAs since they seem to be more exact. They do require additional software to use, though. The Git page can be found below; all you need to do is install it from the Extensions tab of your Stable Diffusion UI. Once installed, you'll need to completely close Stable Diffusion and open it again; if it's successful, a LyCORIS tab will appear right next to the LoRA one. If the entire Stable Diffusion UI crashes when you try to use a LoCon, this may be due to a bad download; try redownloading it from its respective CivitAI page. This happens occasionally (I was really confused when it happened to me; I thought I had set it all up wrong, but it was just a random error).
GitHub - KohakuBlueleaf/a1111-sd-webui-lycoris: An extension for stable-diffusion-webui to load lycoris models.
Prompts
Remember there are both negative and positive prompts. Most images generate fine with short prompts, but you will see things like "highres" in positive prompts; for quality terms like "lowres", put them into the negative prompt instead. There are also a ton of embeddings you can use. Embeddings are similar to Textual Inversions but are normally used for negatives; find your preferred mix of negative embeddings and put them into the negative prompt window, or just type the tags yourself. In the positive prompt window, if you're using a LoRA, be sure to read and use its listed triggers, if any (more on those later).
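As a Python-side illustration of embeddings in the negative prompt, here is a hedged diffusers sketch; the embedding file and its token name are placeholders for whichever negative embedding you prefer.

```python
# Sketch: loading a negative embedding (a Textual Inversion file) and
# using its token in the negative prompt. File and token are placeholders.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_textual_inversion("./my_negative.safetensors", token="my_negative")

image = pipe(
    "masterpiece, highres, 1girl",
    negative_prompt="my_negative, lowres, bad anatomy",
).images[0]
```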

How to make a LoRA?
Now, after reading all of this, you may want to try making a LoRA for your own custom pictures, if downloading other people's was not enough for you. A good tool for a capable desktop is Kohya's set of AI training scripts. Kohya uses your own device to train against a base checkpoint: for realistic LoRAs, train on SD 1.5 pruned, and for anime, train on NAI pruned. If you'd rather not use your personal device, I suggest the trainer made by hollowstrawberry on Google Colab.
Why the above models, and what does pruned mean?
Pruned means the checkpoint has been stripped of extra data it doesn't need, leaving a smaller, simplified file that is basically made to be trained on. The reason I suggest training on the models above is compatibility. Some checkpoints also have custom pruned variants, but a LoRA trained on those will look worse when you try it on a different checkpoint. SD 1.5 is the genesis of all checkpoints and compatible with all realism models; the same is true of NAI for anime models. NAI can also work backwards on some realism models, but simply put, it's not optimized for that.
Where do I start?

The first thing to focus on is your concept: what does your LoRA do? Some common ideas are characters, styles, backgrounds, and actual concepts. For any of these topics you need data, i.e. sample pics of what you are trying to teach your AI. You can collect these images from anywhere, such as Pixiv, Twitter, Danbooru, Safebooru, Google, Reddit... ANYWHERE. Once you have all your images, you need to tag them. Tagging means writing down what's in each image for the AI to compare and learn from; it's basically the prompt. Tagging can be long and tedious, but I recommend an extension called "stable-diffusion-webui-wd14-tagger" to assist with the basics; you can add it using the Extensions tab. Once you have at least 20 images tagged and ready, you can officially begin.
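Before moving on, a quick sanity check helps catch untagged images. This is a plain-Python sketch; the folder path is a placeholder for wherever you keep your dataset.

```python
# Sketch: confirm every image in the dataset folder has a matching
# .txt caption from the tagger. "dataset" is a placeholder path.
from pathlib import Path

dataset = Path("dataset")
images = [p for p in dataset.iterdir()
          if p.suffix.lower() in {".png", ".jpg", ".jpeg", ".webp"}]

missing = [p.name for p in images if not p.with_suffix(".txt").exists()]
print(f"{len(images)} images, {len(missing)} missing captions")
for name in missing:
    print("untagged:", name)
```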
Dataset
The dataset of images should, most of the time, have a variety of angles and distances from your target. If training a character, be sure to include a batch of face shots as well as full-body and cropped images so the AI can really understand the look and how they appear overall. Parts of a dataset can also be trained at different weights, so better images can count more in the training; this is also how you train separate terms in a single LoRA if you are designing one with a trigger word or phrase (a necessary word or phrase to activate a LoRA). Not all LoRAs need a trigger, though; most styles, for instance, can be activated by the LoRA alone. If you are making a dataset with a trigger, be sure the trigger is the first word in each image's tag list, as in the sketch below.
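As I understand Kohya-style trainers, dataset folders are named like "10_mychar", where the leading number is how many times each image repeats per epoch; giving better images a higher repeat count is one way to weight them. Below is a hedged sketch that forces a trigger word to the front of every caption file; the trigger and folder path are placeholders for your own setup.

```python
# Sketch: put the trigger word first in every caption file.
# "mychar" and the folder path are placeholders.
from pathlib import Path

trigger = "mychar"
for txt in Path("dataset/10_mychar").glob("*.txt"):
    tags = [t.strip() for t in txt.read_text(encoding="utf-8").split(",")]
    tags = [t for t in tags if t and t != trigger]  # drop blanks and dupes
    txt.write_text(", ".join([trigger] + tags), encoding="utf-8")
```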
Training
The many LoRA types (characters, styles, backgrounds, and actual concepts) are different, and it's important to remember that each needs to be broken down and trained in its own way. What matter most, in either Kohya or Google Colab, are three things: epochs, the text encoder learning rate (text LR), and the U-Net learning rate (unet LR).
Learning Rates:
Learning rates, or LRs, are going to be the main thing to adjust from LoRA to LoRA; styles, for example, use slightly lower learning rates than characters on average because they are more general. To find a good mix for your starting project, I suggest simply using the same values as someone who made a similar LoRA that you like, or you can always try, try, and try again until you're satisfied (but really, start from the values of a similar one that you like). You can see the training info of any LoRA using the "sd-webui-additional-networks" extension, which can also be added in the Extensions tab.
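To make those knobs concrete, here is a heavily hedged sketch of launching a local Kohya (kohya-ss/sd-scripts) run from Python. The paths are placeholders, the learning-rate values are common community starting points rather than numbers from this guide, and you should double-check the flags against the sd-scripts README before relying on them.

```python
# Hedged sketch: launching kohya-ss/sd-scripts' train_network.py.
# All paths and values are placeholders; verify flags against the repo.
import subprocess

subprocess.run([
    "accelerate", "launch", "train_network.py",
    "--pretrained_model_name_or_path=./models/nai-pruned.ckpt",  # or SD 1.5
    "--train_data_dir=./dataset",
    "--output_dir=./output",
    "--network_module=networks.lora",
    "--unet_lr=1e-4",          # common starting point; styles often go lower
    "--text_encoder_lr=5e-5",  # often set to about half the unet LR
    "--max_train_epochs=10",
    "--resolution=512,512",
    "--save_model_as=safetensors",
], check=True)
```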
Epochs
An epoch is one full review of your dataset. Normally you would set any single image to be repeated about 10 times per epoch in the training steps, and by adding more epochs the AI views the entire dataset that many more times. I tend to use 10 epochs for everything, which at 10 repeats is 100 views of each image in total. With minimal epochs you get minimal strength from your LoRA; personally, I wouldn't go any lower than 8 or any higher than 12, from my own experience.
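Spelled out as a quick calculation, with illustrative numbers only:

```python
# Illustrative numbers: 20 images, repeated 10 times per epoch, 10 epochs.
images, repeats, epochs, batch_size = 20, 10, 10, 2

views_per_image = repeats * epochs                     # 100 looks per image
total_steps = images * repeats * epochs // batch_size  # 1000 training steps
print(views_per_image, total_steps)
```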
Publishing
Once your LoRA is all made and done, download it and set it up. Generate a good batch of images for its page so people can see what it is, and be sure to write a general description so people can understand what it does and how it does it; no one likes a blank description. If any of your sample images use LoRAs made by other people, be sure to credit them in the version notes. It's not required, but it is polite.

Afterword
If you have any further questions, be sure to message me in the comments or on the Discord and I'll do what I can (I'm pretty active). Below is a series of tools and tutorials you may find useful. Please enjoy, and I hope I helped you learn something.
Resources
How to set up SD:
Stable Diffusion Resources (rentry.co)
-How to set SD up for anime:
How to use "Deep Danbooru" in the AUTOMATIC1111 Stable Diffusion web UI to find the Danbooru tags for an illustration (working backwards from image to prompt) - GIGAZINE
--Danbooru Tag lists for anime image prompts (age warning for the site)
Tag Groups Wiki | Danbooru (donmai.us)
Training guides for further detail:
LoRA Training Guide (rentry.org)
THE OTHER LoRA TRAINING RENTRY
LAZY TRAINING GUIDE (rentry.org)
Hollowstrawberry's google colab:
kohya-colab/README.md at main · hollowstrawberry/kohya-colab · GitHub
Recommended extensions for automatic1111:
GitHub - DominikDoom/a1111-sd-webui-tagcomplete: Booru style tag autocompletion for AUTOMATIC1111's Stable Diffusion web UI
GitHub - kohya-ss/sd-webui-additional-networks
GitHub - toriato/stable-diffusion-webui-wd14-tagger: Labeling extension for Automatic1111's Web UI
GitHub - KohakuBlueleaf/a1111-sd-webui-lycoris: An extension for stable-diffusion-webui to load lycoris models.
CivitAI Discord:
https://discord.gg/UwX5wKwm6c
