Huge news for Kohya GUI — Now you can fully Fine Tune / DreamBooth FLUX Dev with as low as 6 GB GPUs without any quality loss compared to 48 GB GPUs — Moreover, Fine Tuning yields better results than any LoRA training could

Config Files

I published all configs here : https://www.patreon.com/posts/112099700

Tutorials

Fine tuning tutorial in production

Windows FLUX LoRA training (fine tuning is same just config changes) : https://youtu.be/nySGu12Y05k

Cloud FLUX LoRA training (RunPod and Massed Compute ultra cheap) : https://youtu.be/-uhL2nW7Ddw

LoRA Extraction

The checkpoint sizes are 23.8 GB but you can extract LoRA with almost no loss quality — I made a research and public article / guide for this as well

LoRA extraction guide from Fine Tuned checkpoint is here : https://www.patreon.com/posts/112335162

Info

This is just mind blowing. The recent improvements Kohya made for block swapping is just amazing.

Speeds are also amazing that you can see in image 2 — of course those values are based on my researched config and tested on RTX A6000 — same speed as almost RTX 3090

Also all trainings experiments are made at 1024x1024px. If you use lower resolution it will be lesser VRAM + faster speed

The VRAM usages would change according to your own configuration — likely speed as well

Moreover, Fine Tuning / DreamBooth yields better results than any LoRA could

Installers

1-Kohya GUI accurate branch and Windows Torch 2.5 Installers and test prompts shared here : https://www.patreon.com/posts/110879657

The link of Kohya GUI with accurate branch : https://github.com/bmaltais/kohya_ss/tree/sd3-flux.1