About Dim and Alpha in LoRA Training

Dec 16, 2025

In this article, we will discuss what Dim and Alpha are, how they work, and what impact they have, so you can decide which values are best for you.

This article aims to simplify these concepts so that anyone can easily understand this topic. Therefore, no technical terminology, advanced mathematics, or mentions of vectors, matrices, etc., will be used.

This is the third article discussing LoRA Training topics. If you are interested, you can dive deeper by reading the previous two:

Training the Text Encoder in LoRa: Why it Matters for Style | Civitai

About the trigger word in LoRa training | Civitai

What is Dim?

Basically, Dim is how much space we have to store the characteristics of our dataset (a higher Dim also translates into a heavier LoRA file in MB). Let's imagine that Dim represents boxes.

Example 1: Dim 16

Imagine we are training a Toon style with a Dim of 16. We only have 16 boxes to save information. The model will be smart and save only what is most important and frequent: line thickness, eye shape, the nose, etc. But what happens if the style is very complex? The 16 boxes will fill up quickly. If there are still important details left to learn, the model won't have anywhere to store them. It will be forced to ignore information or mix it up. This causes the LoRA to not be faithful to the style or to lose important details (this is technically known as underfitting, or a lack of capacity).

Example 2: Dim 256

Great, we have tons of space! But... be careful. Since our Toon style is simple, maybe we capture the essence of the style using only 50 boxes. What happens to the remaining 206 empty boxes? The model always wants to fill the boxes, so it will start saving things we don't care about: a specific pose that repeats, the background color of an image, or the exact anatomy of a photo.

The result is overfitting. The model memorizes your images so thoroughly that it loses flexibility. It might generate the perfect style, but suddenly all your characters appear with the same pose or background, because the LoRA "learned" that this was part of the style.
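
If you want to put the "heavier LoRA in MB" part from earlier into numbers: for every layer the LoRA touches, it stores two small extra matrices whose size grows linearly with Dim, so doubling Dim roughly doubles the file size. Below is a minimal sketch with one made-up layer size (real models adapt many layers of different sizes), just to show the trend:

```python
# Rough sketch: why a higher Dim means a heavier LoRA file.
# For each layer the LoRA adapts, it stores two small matrices of shape
# (out_features x dim) and (dim x in_features), so the extra parameters
# grow linearly with Dim. The layer size below is made up.

def lora_params_per_layer(in_features: int, out_features: int, dim: int) -> int:
    return dim * (in_features + out_features)

for dim in (16, 32, 128, 256):
    params = lora_params_per_layer(1280, 1280, dim)  # hypothetical 1280x1280 layer
    size_mb = params * 2 / 1024**2                   # assuming fp16: 2 bytes per parameter
    print(f"Dim {dim:>3}: {params:,} parameters (~{size_mb:.2f} MB) for this one layer")
```

The exact numbers also depend on how many layers the LoRA is applied to and the precision it is saved in, so treat this only as intuition for why a Dim 256 file weighs so much more than a Dim 32 one.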

What is Alpha?

If Dim represents the boxes where we store information, Alpha is how strong that information will be. To explain Alpha better, we will change the analogy from boxes to a canvas.

Alpha equals Dim: Imagine you are trying to trace a drawing using a thick, totally opaque permanent marker. Every stroke you make lands instantly and at full strength. If your hand shakes a little, that mistake is permanently recorded in the drawing. The model learns both the "essence" of the character and the "trash" of the image with the same intensity. The result is a saturated style with quite a bit of visual noise.

Alpha at half the Dim: Now imagine you are using soft watercolor at 50% opacity. A single stroke is barely noticeable. For a line to be visible and strong, you have to pass the brush over the same spot several times. This is where filtering occurs. Traits that are always there (eye shape, hair) will receive many passes and become sharp and strong. But "errors" or random details (a weird background in a single image, digital noise) will only receive a soft pass and will barely be visible in the final result. By forcing the model to learn "by repetition" and not "by impact," the clean signal accumulates and the noise fades away.

Note: The examples below are made with the same noise level; the only thing that changes is the opacity of the image.

[Illustration: the same noise at full opacity vs. 50% opacity]
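
If you do want a small peek behind the analogy: in the usual LoRA formulation, Alpha is just a scaling factor, and the learned change is multiplied by Alpha/Dim before being added to the original weights. A minimal sketch with made-up shapes and values, assuming that common formulation:

```python
import numpy as np

# Minimal sketch of Alpha as a scaling factor (shapes and values are made up).
# In the usual LoRA formulation the learned update is applied as:
#   W' = W + (alpha / dim) * (B @ A)
dim, in_f, out_f = 32, 64, 64
rng = np.random.default_rng(0)

W = rng.normal(size=(out_f, in_f))        # frozen original weight
A = rng.normal(size=(dim, in_f)) * 0.01   # LoRA "down" matrix (learned)
B = rng.normal(size=(out_f, dim)) * 0.01  # LoRA "up" matrix (learned)

for alpha in (dim, dim // 2):             # Alpha = Dim  vs.  Alpha = Dim / 2
    scale = alpha / dim
    W_adapted = W + scale * (B @ A)       # same learned content, different strength
    change = np.abs(W_adapted - W).mean()
    print(f"Alpha {alpha:>2}: scale {scale:.2f}, mean |change to W| {change:.6f}")
```

A lower Alpha doesn't throw information away; it just makes each learned change push the weights less hard, which is roughly the "many soft passes instead of one heavy stroke" idea from the analogy.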

Examples:

Dim:

[Comparison image: Dim]

At first glance, we can see that the difference in style isn't big. As we discussed in the theory section, our style isn't very complicated, so it takes little capacity to learn it. In this case, we can save the extra 600 MB that the Dim 128 LoRA weighs. We can also see that the hand was generated poorly, which suggests it is probably trying to replicate a hand it saw in the dataset.

Alpha:

[Comparison image: Alpha]

Note: The Dim in these LoRAs is 32. (I apologize for the difference in the image compared to the Dim section; I lost some generation parameters).

In the Alpha=Dim version, we can see that there is more visual noise on the character (which isn't necessarily bad); the dress has visible seams, wrinkles, and many more shadows than in the Alpha=Dim/2 version. In general there is also more contrast and shadow across the character, indicating that it picked up smaller details from the individual images.

Conclusions

Personally, I train all my LoRAs (including characters and concepts) with Dim 32 / Alpha 16. However, depending on the case, you might prefer Alpha=Dim if you have a very good, clean dataset, or a higher Dim if the style you want has small details like skin pores and similar fine textures.

The good thing is that, as we saw in the examples, playing around with these values isn't fatal, so I invite you to experiment.

If you have knowledge on this topic and see that I have made a mistake at any point, please let me know! The last thing I want is to misinform people, and if that happens, I will edit this article as soon as possible to correct the errors.


UPDATE 12/17/2025

In the comments, user n_Arno shared a great technical insight regarding the relationship between Alpha and Dim. It's worth noting that choosing Alpha/Dim values that result in clean numbers (like 16/32 = 0.5) is better for the model's internal calculations than "messy" fractions (like 1/3 = 0.3333). Using clean ratios helps avoid precision issues during training. If you want a more technical explanation, you can read the comments to see n_Arno's detailed breakdown.
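
As a quick illustration of that precision point (my own toy example, not n_Arno's): a ratio like 16/32 can be represented exactly in floating point, while a ratio like 16/48 cannot:

```python
# Quick illustration of "clean" vs. "messy" Alpha/Dim ratios in floating point.
for alpha, dim in ((16, 32), (16, 48)):
    print(f"{alpha}/{dim} = {alpha / dim!r}")
# 16/32 = 0.5                  (exact in binary floating point)
# 16/48 = 0.3333333333333333   (rounded, cannot be represented exactly)
```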

LoRA Configuration

Base Model: Illustrious V1.0
Repeats: 5
Epoch: 10
Steps: 990
Batch Size: 4
Clip Skip: 1
TE learning rate: 0.0005 
UNet learning rate: 0.0005
LR Scheduler: cosine_with_restarts
lr_scheduler_num_cycles: 3
Optimizer: AdamW8bit
Min SNR Gamma: 5
Noise Offset: 0.1
Multires noise discount: 0.3
Multires noise iterations: 8
Zero Terminal SNR: True
Shuffle caption: True
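
For reference, the Steps value follows from the dataset size together with Repeats, Epochs, and Batch Size. The sketch below assumes the usual kohya-style calculation (steps per epoch = ceil(images × repeats ÷ batch size)); the 79-image dataset size is my own inference from the numbers above, not something stated in the article:

```python
import math

# Hedged sketch: how Steps relates to Repeats, Epochs and Batch Size in
# kohya-style trainers (steps per epoch = ceil(images * repeats / batch_size)).
# The 79-image dataset size is inferred from the configuration, not stated.
images, repeats, epochs, batch_size = 79, 5, 10, 4

steps_per_epoch = math.ceil(images * repeats / batch_size)
total_steps = steps_per_epoch * epochs
print(total_steps)  # 990, matching the configuration above
```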

Web App: Booru Prompt Gallery V5.1 | Civitai

Web App: Booru Tag Gallery | Civitai

App: Regional Multi Crop - Dataset Tool | Civitai

Tools I use for LoRa training | Civitai

Training the Text Encoder in LoRa: Why it Matters for Style | Civitai

About the trigger word in LoRa training | Civitai

That’s it!

Stay hydrated and don’t forget to blink.
