In the past few months I have noticed lots of questions about the
training process, dataset & parameters,
so I decided to write my own article explaining them.
First of all, a small clarification:
I use this guide with the LoRA trainer on "Civitai", so it may not be as effective with other trainers.
Let's start with the dataset:
You need to be sure which type of LoRA you want to train: "character/art style/concept"
The quantity, diversity, and quality of images are important factors when training a model
In my case I gather at least 30 to 60 pics for a character LoRA & 120 for the other 2 options.
The Civitai trainer has a minimum pixel size of 256x256 & a maximum pixel size of 2048x2048. Any larger or smaller image will be resized to fit.
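The size limits above can be checked before uploading. This is only a rough sketch: the function name is made up for illustration, and it assumes the limits apply to each side of the image separately, which the trainer may handle differently.

```python
# Size limits quoted above for the Civitai trainer.
MIN_SIDE = 256
MAX_SIDE = 2048


def check_image_size(width: int, height: int) -> str:
    """Say whether an image of this size would likely be resized.

    Assumption: the limits are applied per side (hypothetical helper,
    not part of any trainer API).
    """
    if width < MIN_SIDE or height < MIN_SIDE:
        return "will be upscaled"
    if width > MAX_SIDE or height > MAX_SIDE:
        return "will be downscaled"
    return "ok"


if __name__ == "__main__":
    for size in [(200, 300), (1024, 1024), (4000, 3000)]:
        print(size, check_image_size(*size))
```

Pics that fall outside the range are still usable, but resizing them yourself lets you control the quality.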
For a character LoRA most of the pics need to focus on the character itself, I mean: no other characters, a simple background & a full body view.
You need to edit the pics to remove any existing text, along with any artist name & watermark.
Once you have gathered all that you can move on to the tagging process:
Most LoRA models use at least 1 trigger prompt ("prompt" = "tag")
If you're going to train a character model you need:
core prompt = most of the time it is the "character name", sometimes alongside the series/game/comic it belongs to
For an art style LoRA it is usually "artist name + style" or made by "artist name"
& for concept LoRAs it is more variable & complex; sometimes you need more than 1 "core prompt"
Remember this simple formula for tagging:
character: "character name" + "series/game/comic" = "character core prompt"
art style: "artist name + style" = "style core prompt"
concept: in this case I will use my emoji LoRA as an example,
which consists of 2 core prompts: "emoji race" & "circle head"
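The formula above can be written as a small helper. This is just an illustrative sketch: the function name is hypothetical, and the way the character name and the series are joined (name followed by the series in parentheses) is an assumption, since the formula only says they go together.

```python
from typing import List, Optional


def core_prompts(lora_type: str, names: List[str],
                 source: Optional[str] = None) -> List[str]:
    """Build the core prompt(s) for a LoRA (hypothetical helper).

    lora_type: "character", "style" or "concept".
    names:     character name, artist name, or the concept's core prompts.
    source:    optional series/game/comic for a character.
    """
    if lora_type == "character":
        name = names[0]
        # Assumption: the series is appended in parentheses.
        return [f"{name} ({source})" if source else name]
    if lora_type == "style":
        return [f"{names[0]} style"]
    if lora_type == "concept":
        # A concept can need more than 1 core prompt.
        return list(names)
    raise ValueError(f"unknown LoRA type: {lora_type}")


if __name__ == "__main__":
    print(core_prompts("character", ["character name"], "series"))
    print(core_prompts("style", ["artist name"]))
    print(core_prompts("concept", ["emoji race", "circle head"]))
```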
Now, with the dataset gathered & the tags done, the easy part is the parameters.
The training parameters come in 2 parts.
The first one goes alongside the tagging process;
the Civitai trainer will ask you for these options:
"tag / caption" & "ignore/append/overwrite"
one trigger prompt = "core prompt"
the "max amount of tags": between 10 & 15 is the best amount to avoid overloading the trainer
the "Min Threshold" of each tag: "0.8" is the value I use
the negative prompts are the easiest part; these are the ones I use:
"multi, multiple characters, multiple panels, mutation, mutated, ugly, disgusting, amputation, deformed, distorted, disfigured, deformed hands, mutated hands, extra fingers, extra limbs, watermark, signature, artist name, patreon username"
the "Prepend Tags" & lastly the "Append Tags"
These options are the most important part of the tagging process.
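To see how these options could work together, here is a minimal sketch. It is not the Civitai tagger itself: the function, its argument names and the order in which the options are applied are all assumptions made for illustration, including the idea that the negative prompts are simply dropped from the automatic tags.

```python
from typing import Iterable, List, Tuple


def build_caption(scored_tags: Iterable[Tuple[str, float]],
                  trigger: str,
                  max_tags: int = 10,
                  min_threshold: float = 0.8,
                  excluded: Iterable[str] = (),
                  prepend: Iterable[str] = (),
                  append: Iterable[str] = ()) -> str:
    """Combine auto-tagger output with the options above (hypothetical)."""
    excluded_set = {tag.lower() for tag in excluded}
    # Keep tags at or above the threshold that are not excluded.
    kept = [(tag, score) for tag, score in scored_tags
            if score >= min_threshold and tag.lower() not in excluded_set]
    # Most confident tags first, then cut at the maximum amount.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    top = [tag for tag, _ in kept[:max_tags]]
    # Assumed order: trigger, prepended tags, automatic tags, appended tags.
    ordered = [trigger, *prepend, *top, *append]
    seen = set()
    unique: List[str] = []
    for tag in ordered:
        if tag not in seen:  # drop duplicates, keep the first occurrence
            seen.add(tag)
            unique.append(tag)
    return ", ".join(unique)


if __name__ == "__main__":
    scored = [("1girl", 0.99), ("smile", 0.85),
              ("watermark", 0.95), ("hat", 0.50)]
    print(build_caption(scored, "emoji race", excluded=["watermark"]))
```

With the made-up scores in the example, "hat" falls below the 0.8 threshold and "watermark" is excluded, so only the trigger and the two remaining tags end up in the caption.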
Once you're sure about every tag & all the gathered pics are correctly tagged, you may continue to the last part:
the training parameters.
These are the parameters I use the most:
"This doesn't mean they're the best, but rather that in my case they've given me better results"
epochs = 10
num repeats = 20
train batch size = 3
resolution = 1024 (always)
enable bucket = yes
shuffle tags = no (do not use this, it is truly tricky to use)
keep tokens = 0
clip skip = 1
flip augmentation = 1
unet LR = 0.00050
text encoder LR = 0.00010
LR scheduler = cosine
LR scheduler Cycles = 3
Min SNR Gamma = 5
Network Dim = 32
Network Alpha = 16
Noise offset = 0.03
Optimizer = Prodigy
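A quick way to see what these numbers mean in practice is to work out the total training steps. The sketch below assumes the convention used by kohya-style trainers (steps per epoch = images x repeats / batch size, rounded up) and that the LoRA weights are scaled by alpha / dim; the function names are made up for illustration, and the Civitai trainer may count things differently.

```python
import math

# A few of the values from the list above.
params = {
    "epochs": 10,
    "num_repeats": 20,
    "train_batch_size": 3,
    "network_dim": 32,
    "network_alpha": 16,
}


def total_steps(num_images: int, epochs: int,
                num_repeats: int, batch_size: int) -> int:
    """Total optimizer steps, assuming the kohya-style convention."""
    steps_per_epoch = math.ceil(num_images * num_repeats / batch_size)
    return steps_per_epoch * epochs


def lora_scale(network_dim: int, network_alpha: int) -> float:
    """Scaling applied to the LoRA weights, assuming alpha / dim."""
    return network_alpha / network_dim


if __name__ == "__main__":
    for images in (30, 60, 120):
        steps = total_steps(images, params["epochs"],
                            params["num_repeats"],
                            params["train_batch_size"])
        print(images, "images:", steps, "steps")
    print("scale:", lora_scale(params["network_dim"],
                               params["network_alpha"]))
```

Under these assumptions, 30 pics give 2000 steps, 60 pics give 4000 and 120 pics give 8000, so a bigger dataset may call for fewer repeats or epochs to keep the training length similar.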
With nothing more to add, I hope this is helpful. Feel free to leave your questions and doubts in the comments; I'll try to answer them as best I can and will certainly write more articles explaining this topic in more detail


