This article doesn't cover the basics of training.
Training:
The targets are glasses of different sizes and shapes, and LoRAs are the water we pour into them.
If you put water into a cup, it becomes the cup.
If you put water into a bottle, it becomes the bottle.
If you put water into a teacup, it becomes the teacup.
This is also a concept in The Art of War (孙子兵法): a true strategy should be based on the target. A dim of 32 could be the right call for a complex character, but not necessarily for a simpler one, or for an even more complex one.
If the targets are glasses and LoRAs are the water we pour into them, then we examine the glass first. Is it straight, or does it have a curve? How much water can it hold? That is how we get to know our target.
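To make "how much water" a little more concrete on the LoRA side, here is a minimal sketch of how the number of trainable parameters grows with dim. It relies only on the standard LoRA construction, where a linear layer gets a down matrix (dim x in) and an up matrix (out x dim). The 768x768 layer size and the function name are assumptions picked for illustration, not values from any particular checkpoint.

```python
def lora_param_count(in_features: int, out_features: int, dim: int) -> int:
    """Parameters added by one LoRA pair: down (dim x in) plus up (out x dim)."""
    return dim * (in_features + out_features)


# Hypothetical 768x768 linear layer, used only for illustration.
for dim in (8, 16, 32, 64):
    print(f"dim={dim:>2}: {lora_param_count(768, 768, dim)} parameters per layer")
```

The count grows linearly with dim, so doubling dim doubles the water. Whether the glass can hold it still depends on the target.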
Dataset:
The golden rule of delicious food is always high-quality ingredients. How can one expect good food when its ingredients were collected from a trash bin outside? We make sure the images in our dataset are clean and high effort. There are many, many artists who draw like shit. We don't want that.
Twenty good images are better than 100 low-quality ones. For good images, the sky is the limit: we want as many of them as possible.
Editing:
Images might lack certain details or have flaws, such as something blocking a character's body. Learning how to draw basic lines and remove certain parts is useful. There were many instances where an image had something I wanted, but I couldn't use it directly because it needed an edit. The checkpoints are generalist enough that you don't need artist-level drawing skills.
Tagging:
Tagging is a game of words. In the game of words, we treat tags as Lego parts. That is:
we can replace any part,
we can make up any part,
or we can remove any part.
If a checkpoint knows what "dynamic lighting" is, it can surely learn "XY lighting". Then, if we can teach "XY lighting", we can surely teach "A XY lighting" as a different variation.
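The Lego idea can be sketched in a few lines of Python. This is only an illustration, assuming captions are stored as comma-separated tags; the helper names, the sample caption, and the made-up tag "xyz_style" are all hypothetical.

```python
def split_tags(caption: str) -> list[str]:
    """Split a comma-separated caption into clean tags."""
    return [t.strip() for t in caption.split(",") if t.strip()]


def replace_tag(tags: list[str], old: str, new: str) -> list[str]:
    """Swap one Lego part for another."""
    return [new if t == old else t for t in tags]


def add_tag(tags: list[str], new: str) -> list[str]:
    """Attach a made-up part, unless it is already there."""
    return tags if new in tags else tags + [new]


def remove_tag(tags: list[str], unwanted: str) -> list[str]:
    """Take a part off."""
    return [t for t in tags if t != unwanted]


# Hypothetical caption, for illustration only.
caption = "1girl, dynamic lighting, outdoors, watermark"
tags = split_tags(caption)
tags = replace_tag(tags, "dynamic lighting", "XY lighting")  # replace a part
tags = add_tag(tags, "xyz_style")                            # make up a part
tags = remove_tag(tags, "watermark")                         # remove a part
result = ", ".join(tags)
print(result)
```

In practice the same three operations would be run over every caption file in the dataset, so the new tag is used consistently.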

