Type | Other |
Stats | 397 |
Reviews | 296 |
Published | Mar 30, 2023 |
Base Model | |
Hash | AutoV2 8739C76E68 |
Experimenting and observing changes
1. LR-Text Encoder
This information comes from my personal tests (via LoRA weight adjustment) and may not match your results; please test it yourself.
Sometimes a LoRA can only be trained on the U-Net. What influence does the Text Encoder have on the U-Net? That takes time to observe.
Questions:
How important is the TE compared to the U-Net?
How many training steps give the best results without overfitting or underfitting?
DIM = 8, Alpha = 4
Example TE weights, with the U-Net LR fixed at 1e-4:
Unet 1e-4, TE 5e-5 [x0.5]
Unet 1e-4, TE 1e-4 [x1]
Unet 1e-4, TE 2e-5 [x0.2]
Unet 1e-4, TE 1e-5 [x0.1]
Unet 1e-4, TE 3e-4 [x3]
Result https://imgur.com/Cs1As45
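For reference, a minimal Python sketch that prints the LR pairs above in the style of kohya-ss sd-scripts arguments (the `--unet_lr` / `--text_encoder_lr` flag names are from sd-scripts' train_network.py):

```python
# The TE/U-Net LR ratios tested above, as sd-scripts-style flags.
unet_lr = 1e-4
for ratio in (0.5, 1.0, 0.2, 0.1, 3.0):
    te_lr = unet_lr * ratio
    print(f"--unet_lr {unet_lr:g} --text_encoder_lr {te_lr:g}  # x{ratio:g}")
```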
Reducing the TE too much creates non-existent objects and damages clothing.
If the TE LR is set equal to the U-Net's, reducing the TE weight afterward still produces strange images or distorted clothing.
The TE will not cause overfitting as long as its value does not exceed the U-Net's (x1).
If using LR decay, the U-Net's 1e-4 can be kept to hold quality consistent.
Personal opinion: the TE acts as an indicator of what is happening in the training image and keeps the details in the picture.
If this value is too high, it also picks up useless things; if it is too small, image details are lacking.
TE 5e-5 test results, by epoch
Every epoch = 237 steps: https://imgur.com/a/SdYq1ET
Good results in epochs 6 to 8, i.e. 1422 to 1896 steps.
It can go up to 3K steps if there is enough training image data.
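For reference, a minimal sketch of where a step count like 237/epoch comes from (the formula matches how steps per epoch are usually counted in kohya-style training; the image count, repeats, and batch size here are assumptions):

```python
import math

# assumed example: 237 images x 1 repeat at batch size 1 -> 237 steps/epoch
num_images, repeats, batch_size = 237, 1, 1
steps_per_epoch = math.ceil(num_images * repeats / batch_size)
for epochs in (6, 8):
    print(epochs, "epochs =", steps_per_epoch * epochs, "steps")  # 1422, 1896
```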
2. LR-Unet https://imgur.com/lVilHf9
The U-Net changes the image the most. Using too many or too few steps greatly affects LoRA quality.
Using a higher U-Net LR than usual can turn the LoRA into a style LoRA [even if no style was intended]. This can happen when there are fewer than 100 training images.
At U-Net 3e-4 with TE 1e-4 [x0.3], there is a chance that details will be lost.
When using TE x0.5, even with a 2x higher U-Net LR, halving the TE and Alpha prevents the U-Net from overfitting [though training too many steps can still overfit].
At 5e-5, the "white shirt" tag came out badly: TE = 5e-5 causes poor tag retention,
so it may need training up to 10 epochs.
PS: Using a DIM higher than 16 or 32 might use more of the U-Net? [idk]
3. Train TE vs Unet Only [WIP] https://imgur.com/pNgOthy
File size: TE only 2,620 KB | Both 9,325 KB | U-Net only 6,705 KB
The U-Net alone can produce images even without the TE, but the details of the outfit are sometimes worse.
Training both makes image deformation in the model less likely. If you intend to train a style LoRA, train only the U-Net; see the sketch below.
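A minimal sketch of choosing what to train, written as an argument list for kohya-ss sd-scripts' train_network.py (`--network_train_unet_only` is an sd-scripts flag; the "/path/to/..." values are placeholders):

```python
# Builds a command line for kohya-ss sd-scripts' train_network.py.
args = [
    "python", "train_network.py",
    "--pretrained_model_name_or_path", "/path/to/base_model.safetensors",
    "--train_data_dir", "/path/to/images",
    "--network_module", "networks.lora",
    "--network_dim", "8", "--network_alpha", "4",
    "--unet_lr", "1e-4", "--text_encoder_lr", "5e-5",
]
style_lora = True
if style_lora:
    args.append("--network_train_unet_only")  # skip the Text Encoder
print(" ".join(args))
```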
4. min_snr_gamma [WIP]
It is a new parameter that re-weights the loss, so training takes less time.
Gamma test [training] = 1 to 20
Loss/avg graph, top to bottom: no_gamma / 20 / 10 / 5 / 2 / 1
From the experiment, the average loss curve becomes smoother.
If the average loss is above 0.15, use gamma = 5.
If, with gamma = 5, the average loss still exceeds 0.1, reduce gamma below 5.
In this case, gamma = 5 still gave a mean loss around 0.1 and was highly unstable within 500 steps.
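For context, a minimal sketch of the Min-SNR weighting this parameter implements: each timestep's loss is weighted by min(SNR, gamma)/SNR (for epsilon-prediction), where `alphas_cumprod` comes from the diffusion noise schedule:

```python
import torch

def min_snr_weights(alphas_cumprod, timesteps, gamma=5.0):
    """Per-sample Min-SNR loss weights for epsilon-prediction."""
    ac = alphas_cumprod[timesteps]
    snr = ac / (1.0 - ac)  # SNR(t) = abar_t / (1 - abar_t)
    return torch.minimum(snr, torch.full_like(snr, gamma)) / snr

# usage sketch: weight the per-sample MSE before averaging
# w = min_snr_weights(alphas_cumprod, t).view(-1, 1, 1, 1)
# loss = (w * (noise_pred - noise) ** 2).mean()
```

High-noise timesteps (low SNR) keep weight 1, while low-noise timesteps are capped, which is why the average loss curve flattens out.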
4.1. DIM / Alpha [WIP]
?? Does using a lower Alpha (or 1) require a higher U-Net LR regardless of DIM ?? See the sketch below.
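A plausible explanation, as a minimal sketch of how LoRA applies Alpha (the class name is mine, not a library's): the low-rank update is scaled by alpha/dim, so a smaller Alpha shrinks the effective step and needs a higher LR to compensate.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA wrapper: y = W x + (alpha / dim) * up(down(x))."""
    def __init__(self, base: nn.Linear, dim: int = 8, alpha: float = 4.0):
        super().__init__()
        self.base = base
        self.down = nn.Linear(base.in_features, dim, bias=False)
        self.up = nn.Linear(dim, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)  # LoRA starts as a no-op
        self.scale = alpha / dim        # alpha=1, dim=8 -> 0.125 (weak updates)

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))
```

With alpha = dim the scale is 1; with alpha = 1 and dim = 8 every update is multiplied by 0.125, which is consistent with needing a higher LR, independent of DIM.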
4.2 Bucket [WIP]
Based on my understanding of what is displayed in CMD:
Bucketing sorts the various image sizes into aspect-ratio buckets,
downscaling them to fit the configured resolution. If an image's aspect ratio exceeds the available buckets, it gets cropped, so try to keep your character as centered as possible. A rough sketch follows.
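A rough sketch of the idea, not kohya's exact algorithm (the area budget, step size, and side limits are assumptions):

```python
def make_buckets(max_area=512 * 512, step=64, min_side=256, max_side=1024):
    """Enumerate (w, h) bucket resolutions whose pixel area fits the budget."""
    buckets = set()
    for w in range(min_side, max_side + 1, step):
        h = min((max_area // w) // step * step, max_side)
        if h >= min_side:
            buckets.add((w, h))
            buckets.add((h, w))  # portrait and landscape variants
    return sorted(buckets)

def nearest_bucket(w, h, buckets):
    """Pick the bucket with the closest aspect ratio; the image is then
    downscaled to it and any overhang is cropped."""
    return min(buckets, key=lambda b: abs(b[0] / b[1] - w / h))

print(nearest_bucket(1200, 900, make_buckets()))  # picks a bucket close to 4:3
```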
4.3 Noise_offset
Use this setting if the trained images are too bright or too dark; set it no higher than 0.1.
In most cases, for training on anime images, 0 is recommended.
PS: this setting makes overfitting easier.
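For context, a minimal sketch of what the offset-noise trick does during training (the per-channel constant shift matches the commonly published implementation; shapes assume SD latents):

```python
import torch

def offset_noise(latents: torch.Tensor, noise_offset: float = 0.05) -> torch.Tensor:
    """Gaussian noise plus a per-channel constant shift ('offset noise')."""
    noise = torch.randn_like(latents)
    if noise_offset > 0:
        # one random value per (batch, channel), broadcast over H and W,
        # which lets the model learn overall brightness shifts
        noise += noise_offset * torch.randn(
            latents.shape[0], latents.shape[1], 1, 1, device=latents.device)
    return noise
```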
4.4 Weight_Decay , betas
Weight decay is quite difficult to tune; a value between 0.01 and 1 is recommended.
As for betas, leave the defaults alone.
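As a sketch, here is where these two land in a PyTorch AdamW constructor (the parameter list is a stand-in for the LoRA weights):

```python
import torch

params = [torch.nn.Parameter(torch.zeros(8, 8))]  # stand-in for LoRA weights
optimizer = torch.optim.AdamW(
    params,
    lr=1e-4,
    betas=(0.9, 0.999),  # PyTorch defaults; per the note above, don't change them
    weight_decay=0.01,   # suggested range: 0.01 to 1
)
```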
5. Optimizer [WIP]
I need to experiment more.
== Adjust learning rate ==
AdamW8bit
Adam
Lion8bit [VRAM use: low]
Lion [VRAM use: medium]
SGDNesterov
SGDNesterov8bit
== Auto-adjust learning rate ==
DAdaptation [VRAM use: high]
DAdaptAdaGrad
DAdaptAdan
DAdaptSGD
AdaFactor
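A minimal sketch instantiating one optimizer from each group, assuming the bitsandbytes and dadaptation packages are installed (for the self-tuning D-Adaptation family, the convention is lr=1.0):

```python
import torch
import bitsandbytes as bnb
import dadaptation

params = [torch.nn.Parameter(torch.zeros(8, 8))]  # stand-in for LoRA weights

# fixed-LR group: 8-bit AdamW keeps optimizer state quantized (low VRAM)
opt = bnb.optim.AdamW8bit(params, lr=1e-4)

# auto-LR group: D-Adaptation estimates the step size itself, so lr stays 1.0
opt_auto = dadaptation.DAdaptAdam(params, lr=1.0)
```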
6. Scheduler [WIP]
linear, cosine, cosine_with_restarts, polynomial, constant (default), constant_with_warmup
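These names match what `diffusers.optimization.get_scheduler` accepts, so as a minimal sketch (the warmup and total step counts are assumptions):

```python
import torch
from diffusers.optimization import get_scheduler

params = [torch.nn.Parameter(torch.zeros(8, 8))]
optimizer = torch.optim.AdamW(params, lr=1e-4)
scheduler = get_scheduler(
    "cosine_with_restarts", optimizer=optimizer,
    num_warmup_steps=100,     # assumed warmup
    num_training_steps=1896,  # e.g. 8 epochs x 237 steps
)
for _ in range(1896):
    optimizer.step()   # in real training, loss.backward() comes first
    scheduler.step()
```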
?. LoRA training estimation
This is an idealized picture of training, which is hard to achieve given the many factors involved.
With too little training or too high a U-Net LR, the Text Encoder doesn't get enough information and lacks detail.
With a low learning rate, training takes longer than usual; this makes overfitting very difficult but underfitting easier.
The TE is responsible for storing what each tag refers to in the image and saving the details under that tag.
The more the U-Net changes, the more data it collects?