2023.12.13: Article updated. The 54MB 8dim Lora is not the smallest workable size; 4dim is.
2024.1.7: I'm very happy training at 4dim. It is enough for single-character Loras.
2024.1.20: I rewrote the article again.
Around November 2023, I proposed that 50MB is sufficient for an SDXL 2D Lora, and I still believe that exceeding 50MB (8dim) is unnecessary when training a single SDXL 2D character.
Even for simple characters, 27MB (4dim) is completely sufficient.
At that time, I gave the following example:
For the character "Black Tower" (heita), I trained two Loras with different dimensions while keeping all other parameters the same:
The training model used was SDXL1.0 official model.
The testing and inference model was Kohaku-XL beta (beta5), a Stable Diffusion checkpoint on Civitai.
Among them, "heita_32 to 8" is the 32dim Lora reduced to 8dim using the supermerger plugin.
Plugin link: hako-mikan/sd-webui-supermerger, a model merge extension for the Stable Diffusion web UI (github.com).
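Supermerger does the rank reduction in the web UI. If you prefer the command line, kohya-ss sd-scripts ships a resize script that performs the same kind of reduction via SVD; the script path, flags, and file names below are as I recall them and should be checked against your installed version:

```shell
# Sketch, assuming kohya-ss/sd-scripts is installed and heita_32.safetensors exists.
# resize_lora.py re-approximates the LoRA weights at a lower rank.
python networks/resize_lora.py \
  --model heita_32.safetensors \
  --save_to heita_8.safetensors \
  --new_rank 8 \
  --save_precision fp16
```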
It can be observed that the 8dim Lora, compared to the 32dim Lora, does not show any degradation in visual quality. In some cases, it even improves.
Keep in mind that the 32dim Lora is approximately 200MB, while the 8dim Lora is around 50MB. The two differ significantly in size, and if the results are similar, people will naturally choose the smaller 8dim version.
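The 4x size difference is no accident: a LoRA's parameter count, and therefore its file size, scales linearly with dim (rank). The sketch below illustrates this with made-up layer shapes, not SDXL's real ones:

```python
# Rough sketch of why LoRA file size scales linearly with dim (rank).
# The layer shapes below are hypothetical, for illustration only.

def lora_params(layers, rank):
    """Parameter count for a LoRA over (in_features, out_features) layers.

    Each adapted layer gets a down matrix (rank x in) and an up matrix (out x rank).
    """
    return sum(rank * (i + o) for i, o in layers)

def lora_size_mb(layers, rank, bytes_per_param=2):
    """Approximate file size in MB, assuming fp16 weights (2 bytes each)."""
    return lora_params(layers, rank) * bytes_per_param / 2**20

# Hypothetical network: one hundred 1280x1280 projection layers.
layers = [(1280, 1280)] * 100
size_32 = lora_size_mb(layers, 32)
size_8 = lora_size_mb(layers, 8)
print(f"32dim: {size_32:.1f} MB, 8dim: {size_8:.1f} MB")  # ratio is exactly 4x
```

So halving the dim halves the file, which is why 32dim to 8dim takes roughly 200MB down to roughly 50MB.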
Similarly, I performed tests on another model:
Again, with both 8dim and 32dim versions, it seems that the performance of the 8dim version is better (yet to be verified).
Furthermore, I trained a batch of 8dim Loras, and in one of the character Loras, I even tried fitting seven sets of clothing:
This at least indicates that the information capacity of the 8dim Lora is sufficient.
I kept the habit of training at 8dim for a long time, until one day the Civitai creator kitarz showed me a batch of Loras whose characters were trained at 4dim. When I tested them, I found the characters were reproduced with high fidelity.
Afterward, I thought that perhaps the volume of single-character Loras could be further reduced to 4dim.
On that day, I released my first 4dim Lora:
Since then, I have been using 4dim for training, resulting in Loras with a size of approximately 27MB (I trained the text encoder as well).
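For reference, a 4dim run in kohya-ss sd-scripts (the trainer I assume here) comes down to the network flags below; everything else is the usual dataset and model setup, and the flag names should be verified against your version:

```shell
# Sketch of the relevant kohya sd-scripts flags for a 4dim SDXL LoRA
# (dataset/model/output arguments omitted; flag names are assumptions).
accelerate launch sdxl_train_network.py \
  --network_module networks.lora \
  --network_dim 4 \
  --network_alpha 4 \
  --save_precision fp16
# The text encoder is trained by default; pass --network_train_unet_only to skip it.
```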
In recent tests, I managed to fit six characters in the 4dim space, and they were able to retain their distinctive features effectively.
Recently, jiayev1 proposed in their model introduction that an 8dim Lora is sufficient for training a realistic Lora. I find this to be an interesting viewpoint:
To all SDXL trainers: why not reduce the size of SDXL Loras further, making them more convenient for users and friendlier to disk space?