
Guide - LoRA Training Notes, Log [WIP]

Gtonero · Published: Mar 30, 2023 · Updated: Oct 5, 2024
Experimenting and observing changes

1. LR-Text Encoder

This information is from personal testing and may not match your results. Please test it yourself, via LoRA weight adjustment.

Sometimes a LoRA is trained on the UNet only. So what influence does the Text Encoder (TE) actually have on the UNet? It takes time to observe.

Questions

  • How important is the TE compared to the UNet?

  • How many training steps give the best results without overfitting or underfitting?

DIM = 8, Alpha = 4

example TE weight - Unet 1e-4 TE 5e-5 [x0.5]

example TE weight - Unet 1e-4 TE 1e-4 [x1]

example TE weight - Unet 1e-4 TE 2e-5 [x0.2]

example TE weight - Unet 1e-4 TE 1e-5 [x0.1]

example TE weight - Unet 1e-4 TE 3e-4 [x3]
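For reference, these example weights map onto trainer flags roughly as follows. A minimal sketch, assuming the kohya-ss sd-scripts trainer (train_network.py) and hypothetical model/dataset paths:

```python
import subprocess

# Minimal sketch: launching kohya-ss sd-scripts' train_network.py with
# separate UNet / Text Encoder learning rates. Paths are hypothetical.
cmd = [
    "accelerate", "launch", "train_network.py",
    "--pretrained_model_name_or_path=model.safetensors",  # hypothetical path
    "--train_data_dir=./dataset",                         # hypothetical path
    "--output_dir=./output",
    "--network_module=networks.lora",
    "--network_dim=8",          # DIM = 8
    "--network_alpha=4",        # Alpha = 4
    "--unet_lr=1e-4",           # UNet LR
    "--text_encoder_lr=5e-5",   # TE LR = UNet x0.5
]
subprocess.run(cmd, check=True)
```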

Result: https://imgur.com/Cs1As45

  • Reducing the TE LR too much results in non-existent objects being created and damages the clothing.

  • If the TE is trained at the same LR as the UNet, lowering the TE weight at generation time produces strange images or distorted clothing.

  • The TE will not overfit as long as its LR does not exceed the UNet's [x1].

  • If LR decay is used, the UNet's 1e-4 can be kept to keep quality consistent.

Personal opinion: the TE acts as an indicator of what is happening in the training image and keeps the details in the picture.
If its LR is too high, it also picks up useless things; if too low, the image lacks detail.

TE test results at 5e-5, by individual epoch
(1 epoch = 237 steps): https://imgur.com/a/SdYq1ET

  • Good results in epochs 6 to 8, i.e. 1422 to 1896 steps.

  • It can go up to ~3K steps if there is enough training image data.

2. LR-Unet https://imgur.com/lVilHf9

The UNet changes the image the most; using too many or too few steps greatly affects the quality of the LoRA.

Using a higher UNet LR than usual can turn the result into a style LoRA [even if no style was intended]. This can happen when there are fewer than 100 training images.

At UNet 3e-4 with TE 1e-4 [x0.3], there is a chance that details will be lost.

When using TE at x0.5, even a UNet LR twice as high will not overfit, since the halved TE and Alpha hold the UNet back [but training for too many steps can still cause overfitting].

At 5e-5, the white-shirt tag performs badly: TE = 5e-5 causes poor tag retention,
and training may need to go up to 10 epochs.

PS. Using a DIM higher than 16 or 32 might demand more from the UNet? [idk]

3. Train TE vs Unet Only [WIP] https://imgur.com/pNgOthy
File size: TE 2,620 KB | Both 9,325 KB | UNet 6,705 KB

The UNet by itself can produce images even without a TE, but the details of the outfit are sometimes worse.
Training both deforms the model's output less. If you intend to train a style LoRA, train only the UNet, as in the sketch below.
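A minimal sketch of the UNet-only case, again assuming kohya-ss sd-scripts and hypothetical paths:

```python
import subprocess

# Sketch: train only the UNet (e.g., for a style LoRA) with kohya-ss
# sd-scripts. Drop the flag to train both parts, or swap in
# --network_train_text_encoder_only for the TE-only file above.
cmd = [
    "accelerate", "launch", "train_network.py",
    "--pretrained_model_name_or_path=model.safetensors",  # hypothetical
    "--train_data_dir=./dataset",                         # hypothetical
    "--output_dir=./output",
    "--network_module=networks.lora",
    "--network_train_unet_only",  # skip the Text Encoder entirely
]
subprocess.run(cmd, check=True)
```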

4. min_snr_gamma [WIP]

It is a new parameter that reduces the loss, so training takes less time.

Gamma test [training] = 1 to 20

[Loss/avg plot; curves top to bottom: no_gamma / 20 / 10 / 5 / 2 / 1]

In the experiment, the average loss curve is noticeably smoother.

If Average Loss is above 0.15, use gamma = 5.

If using gamma = 5 and the average loss still exceeds 0.1, then reduce this gamma value below 5.

In this case, gamma = 5 still gives a mean loss around 0.1 and is highly unstable within 500 steps.
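In kohya-ss sd-scripts this is the --min_snr_gamma flag. Conceptually it clamps each timestep's loss weight; a minimal sketch of the Min-SNR-gamma weighting (assuming epsilon-prediction, as in the original paper):

```python
import torch

# Min-SNR-gamma weighting (Hang et al., 2023): clamp each timestep's loss
# weight to min(SNR_t, gamma) / SNR_t. High-SNR (low-noise) timesteps get
# down-weighted, which is what smooths the average loss curve.
def min_snr_weights(snr: torch.Tensor, gamma: float = 5.0) -> torch.Tensor:
    return torch.minimum(snr, torch.full_like(snr, gamma)) / snr

# Toy usage with a few hypothetical per-timestep SNR values.
snr = torch.tensor([0.05, 1.0, 5.0, 25.0, 400.0])
print(min_snr_weights(snr))  # -> [1.0, 1.0, 1.0, 0.2, 0.0125]
```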

4.1. DIM / Alpha [WIP]

?? Does using a lower alpha (or alpha = 1) require a higher UNet LR, regardless of DIM ??
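There is a mechanical reason this is plausible: the LoRA update is scaled by alpha / dim, so a smaller alpha shrinks the effective update and has to be compensated elsewhere. A minimal sketch with hypothetical layer shapes:

```python
import torch

# How network_alpha rescales the learned update: the LoRA delta is
# multiplied by alpha / dim, so alpha=1 at dim=8 shrinks updates 8x,
# which would have to be made up for with a higher UNet LR.
dim, alpha = 8, 4
in_features = 320                     # hypothetical layer width
down = torch.randn(dim, in_features)  # "down" (A) matrix
up = torch.zeros(in_features, dim)    # "up" (B) matrix, zero-initialized

def lora_delta(x: torch.Tensor) -> torch.Tensor:
    # x: (batch, in_features) -> scaled low-rank update, same shape
    return (alpha / dim) * (x @ down.T @ up.T)
```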

4.2 Bucket [WIP]

Based on what is displayed in CMD, bucketing sorts images of various sizes into aspect-ratio buckets, downscaling them according to the resolution setting. If an image's aspect ratio exceeds the specified bucket's, it is cropped, so try to keep your character as centered as possible.
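The relevant flags, assuming kohya-ss sd-scripts; the values below are common defaults shown for illustration, not recommendations from this guide:

```python
# Aspect-ratio bucketing flags, assuming kohya-ss sd-scripts.
bucket_args = [
    "--enable_bucket",
    "--resolution=512,512",    # base training resolution
    "--min_bucket_reso=256",   # smallest bucket edge
    "--max_bucket_reso=1024",  # largest bucket edge
    "--bucket_reso_steps=64",  # bucket edge granularity in pixels
]
```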

4.3 Noise_offset

Use this setting if the training images are too bright or too dark; set it no higher than 0.1.

In most cases, when training on anime images it is recommended to set it to 0.

PS. This setting makes overfitting easier.
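In kohya-ss sd-scripts this is the --noise_offset flag. Mechanically, the offset-noise trick adds a small per-channel constant to the sampled training noise, which is what lets the model learn overall brightness shifts; a minimal sketch:

```python
import torch

# The offset-noise trick: add a small per-channel constant to the sampled
# training noise so the model also learns very bright / very dark images.
def offset_noise(latents: torch.Tensor, offset: float = 0.1) -> torch.Tensor:
    noise = torch.randn_like(latents)
    b, c = latents.shape[:2]
    # one random constant per image and channel, broadcast over H x W
    return noise + offset * torch.randn(b, c, 1, 1, device=latents.device)
```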

4.4 Weight_Decay, betas

It is a parameter that is quite difficult to pin down; a value between 0.01 and 1 is recommended.

As for betas, just leave it unset.
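For orientation, in kohya-ss sd-scripts these are passed through --optimizer_args and end up in the optimizer constructor. A minimal PyTorch sketch of where they land; note AdamW's own defaults are betas=(0.9, 0.999) and weight_decay=0.01:

```python
import torch

# Where weight_decay and betas land: the optimizer constructor. The
# advice above amounts to staying near the defaults and leaving betas
# alone.
params = [torch.nn.Parameter(torch.randn(4, 4))]
opt = torch.optim.AdamW(params, lr=1e-4, betas=(0.9, 0.999), weight_decay=0.01)
```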

5. Optimizer [WIP]

I have to experiment more

== Adjust learning rate ==

AdamW8bit

Adam

Lion8bit [Vram use : Low]

Lion [Vram use : MID]

SGDNesterov

SGDNesterov8bit

== Auto Adjust learning rate ==

DAdaptation [Vram use : High]

DAdaptAdaGrad

DAdaptAdan

DAdaptSGD

AdaFactor
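Selection is via --optimizer_type in kohya-ss sd-scripts. One assumption worth flagging: the auto-adjusting DAdaptation family is typically run with the learning rates set to 1.0 so the optimizer can scale them itself. A minimal sketch:

```python
# Picking an optimizer, assuming kohya-ss sd-scripts. The manual-LR group
# uses unet_lr / text_encoder_lr as given; the DAdaptation family adapts
# the LR itself and is usually run with learning rates of 1.0.
optimizer_args = [
    "--optimizer_type=AdamW8bit",  # or Lion8bit, DAdaptation, AdaFactor, ...
    "--unet_lr=1e-4",
    "--text_encoder_lr=5e-5",
]
```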

6. Scheduler [WIP]

linear, cosine, cosine_with_restarts, polynomial, constant (default), constant_with_warmup
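A minimal sketch of the scheduler flags, again assuming kohya-ss sd-scripts; the extra values here are illustrative, not tested recommendations:

```python
# Scheduler flags, assuming kohya-ss sd-scripts. constant is the default;
# the *_with_warmup and *_with_restarts variants take extra arguments.
scheduler_args = [
    "--lr_scheduler=cosine_with_restarts",
    "--lr_scheduler_num_cycles=3",  # number of cosine restarts
    "--lr_warmup_steps=100",        # for *_with_warmup schedulers
]
```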

?. LoRA training estimation

This is an idealized picture, which is difficult to achieve in practice given the many factors involved.

With too little training or too high a UNet LR, the Text Encoder does not get enough information and the result lacks detail.

With a low learning rate, training takes longer than usual. This makes overfitting very difficult, but underfitting easier.

The TE is responsible for storing the information of each tag, i.e. what it corresponds to in the image, and for saving details under that tag.
The more the UNet changes, the more data it collects?