LoRA Training Guide: SDXL | Pony | Illustrious [Kohya_SS]

System Requirements:

This guide requires at least 12GB of VRAM. I have not tested it with 8GB VRAM configurations.

Tested Hardware:

  • RTX 3060 12GB - Training works but is noticeably slower

  • RTX 5070 Ti 16GB - Smooth training experience with faster processing times

The performance difference between 12GB and 16GB VRAM is expected and primarily affects training speed rather than capability.


Note: This guide does not cover LoRA training for Wan 2.2 or Z-Image Turbo models, as excellent tutorials already exist for these.


This article may be a bit lengthy, but rest assured, it is well worth your time. I will also assume that you are already familiar with the process of installing Kohya_ss locally.

If you don't know how to install Kohya_ss locally: Click-Here


What You Need to Get Started:

1. Base Model Checkpoint

Download the .safetensors file for your chosen base model:

2. Kohya_ss Installation

I recommend using bmaltais's version of Kohya_ss, which includes a user-friendly GUI:

3. Training Dataset

Prepare images of the character, concept, or style you want to train:

  • Recommended: 50 images minimum

  • Image dimensions: 1024x1024 pixels (1:1 ratio)

  • Bulk cropping tool: Birme - excellent for batch processing multiple images to the correct dimensions

4. Configuration File

Download the Kohya_ss config file attached to this guide to ensure optimal training settings.
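
If you prefer to batch-crop locally instead of using Birme (item 3 above), a minimal sketch with Pillow can do the same job. The folder names here are placeholders, not part of the guide:

```python
from pathlib import Path
from PIL import Image  # pip install pillow

def center_crop_resize(src_dir, dst_dir, size=1024):
    """Center-crop every image to a square, then resize to size x size."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).iterdir()):
        if path.suffix.lower() not in {".png", ".jpg", ".jpeg", ".webp"}:
            continue
        img = Image.open(path).convert("RGB")
        side = min(img.size)
        left = (img.width - side) // 2
        top = (img.height - side) // 2
        img = img.crop((left, top, left + side, top + side))
        img.resize((size, size), Image.LANCZOS).save(dst / f"{path.stem}.png")

# Example: center_crop_resize("raw_images", "dataset_1024")
```

A center crop keeps the subject only if it is roughly centered; for off-center subjects, Birme's manual crop handles are still the safer option.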


-> Run Kohya_ss:

  • Download the configuration files and move them to the following directory:
    \kohya_ss\presets\lora\user_presets

  • After moving the configuration file, run gui.bat if you're on Windows. For Mac/Linux users, run gui.sh instead.

  • Once the program is running, copy the local host URL:
    http://127.0.0.1:7860

  • Paste the URL into your browser to open the Kohya GUI.

-> Load Config Files:

Paste the path of the downloaded configuration file into the appropriate box. In my case, the path is: C:\Users\CSP\Downloads\kohya_ss\presets\lora\user_presets

Once the configuration file is in place, click on the blue arrow to load the configuration, as shown in the image.


-> Prepare DataSet:

Important Folder Structure for Kohya Training

Kohya follows a strict folder structure, and placing files in the wrong folders can prevent training from starting or lead to failed training attempts.

  1. Instance Prompt:
    The instance prompt should be a short, easy-to-remember word that will activate the LoRA model. Keep it simple and concise.

  2. Class Prompt:
    The class prompt defines what you're training, such as a person (character), style (look and feel), or concept (objects, clothing, etc.).

  3. Training Images:
    These should be properly captioned images used for training the model. Ensure that they are correctly formatted and labeled.

  4. Repeats:
    The number of times each image is repeated within one epoch. For example, with 50 images and repeats set to 4, one epoch contains 200 steps. I recommend leaving this value at 4 repeats for consistency, and keeping the training set at around 50 images.

  5. Regularization Images:
    These are typically used for DreamBooth and are not needed for LoRA training. Leave this field empty.

  6. Destination Folder:
    This is the most critical folder in the training process. The destination folder is where Kohya will prepare and store the dataset for training. Ensure that this folder is correctly defined, as it will be used throughout the entire training process.
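
The repeats arithmetic above (item 4) can be sanity-checked with a few lines; `steps_per_epoch` is my own name for this, not a Kohya function:

```python
import math

def steps_per_epoch(num_images, repeats, batch_size=1):
    """Images seen per epoch, divided into batches, rounded up."""
    return math.ceil(num_images * repeats / batch_size)

print(steps_per_epoch(50, 4))                # 200 steps at batch size 1
print(steps_per_epoch(50, 4, batch_size=4))  # 50 steps at batch size 4
```

Raising the batch size shortens each epoch in steps, which is why higher-VRAM cards finish noticeably faster at the same epoch count.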

Next Steps:

  1. Click "Prepare Training Dataset" to generate the dataset.

  2. Click "Copy Info to Respective Fields" to automatically paste the dataset path into the corresponding fields. This step is crucial as it ensures accuracy and saves time by avoiding manual entry and preventing path errors.

Your destination folder should look like this:
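
As a rough sketch, a typical prepared destination folder follows this layout (the instance prompt `myOC`, class prompt `character`, and 4 repeats are placeholder values; the .txt captions appear after the captioning step):

```
destination/
├── img/
│   └── 4_myOC character    <- {repeats}_{instance prompt} {class prompt}
│       ├── 001.png
│       ├── 001.txt
│       └── ...
├── log/
└── model/
```

The subfolder name encodes the repeats count and the prompts, which is why Kohya is strict about this structure.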


(Optional) Captioning Your Images: For those who wish to caption their images, follow these steps. If you are already familiar with image captioning, you may skip this section.

Click on the upper tab: Utilities > Captioning > WD14 Captioning
You only need to change three fields there:
- Image folder to caption (where the image files are located)
- Undesired tags (1girl, solo, whatever you don't want WD14 to use as tags)
- Prefix to add to WD14 caption (this is the trigger tag used to call the character)

Then click the button at the bottom of the Kohya page: "Caption Images".


Note: it can take a little while the first time, as Kohya downloads the tagger model before tagging everything; check the console window to see what it is doing.
I left the Repo ID at its default: SmilingWolf/wd-v1-4-convnextv2-tagger-v2
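
If you ever need to redo the prefix and undesired-tags pass on captions WD14 already wrote, a small stdlib sketch can patch the .txt files in place. The trigger tag and undesired tags below are placeholders:

```python
from pathlib import Path

PREFIX = "myOC"                # hypothetical trigger tag
UNDESIRED = {"1girl", "solo"}  # tags you don't want kept

def clean_captions(folder):
    """Strip undesired tags from each .txt caption and prepend the trigger tag."""
    for txt in Path(folder).glob("*.txt"):
        tags = [t.strip() for t in txt.read_text(encoding="utf-8").split(",")]
        tags = [t for t in tags if t and t not in UNDESIRED and t != PREFIX]
        txt.write_text(", ".join([PREFIX] + tags), encoding="utf-8")
```

This mimics what the "Prefix" and "Undesired tags" fields do at caption time, but lets you fix mistakes without re-running the tagger.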


-> Training Parameters:

Although the configuration file automatically sets the training parameters, there are still some critical points that require attention:

  • Select the path of the base model (PDXL | SDXL | Illustrious) or a custom community model

  • For 12GB of VRAM, keep the batch size at 1-2. For 16GB of VRAM or higher, use a batch size of 3-5.

  • I have provided a basic sample prompt structure:

    score_9_up, score_8_up, score_8, 1 girl, looking at viewer, smile, parted lips, lips, standing, fullbody,  --w 1024, --h 1024, --l 4.5, --s 27, --n nsfw nude

--w: Width

--h: Height

--l: CFG scale

--s: Sampling steps

--n: Negative prompt
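
To make the flag meanings concrete, here is an illustrative parser (my own sketch, not Kohya's actual implementation) that splits a sample prompt line into its text and settings:

```python
import re

def parse_sample_prompt(line):
    """Split a Kohya-style sample prompt into prompt text and --flag values."""
    # Each flag's value runs to the end of its comma-separated segment.
    flags = dict(re.findall(r"--(\w+)\s+([^,]+)", line))
    prompt = re.split(r"\s*--", line, maxsplit=1)[0].rstrip(", ")
    return prompt, flags

text, flags = parse_sample_prompt(
    "1 girl, looking at viewer, smile, --w 1024, --h 1024, --l 4.5, --s 27, --n nsfw nude"
)
print(flags)  # {'w': '1024', 'h': '1024', 'l': '4.5', 's': '27', 'n': 'nsfw nude'}
```

Note that `--n` takes everything after it as the negative prompt, so it should come last, exactly as in the example prompt above.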

You can customize the sampling prompt however you like to test your LoRA during training. However, I strongly recommend setting Sample every n epochs to 1 and Save model every n epochs to 1. This will generate a sample image after each epoch based on your prompt, which is essential for tracking your training progress. It helps you detect if the training is going wrong, identify when the model is getting overtrained, or determine the optimal stopping point. Often, an earlier epoch produces better results than the final one, so having samples from every epoch lets you choose the best version of your LoRA.

Click Start Training and enjoy.


-> Q/A

Q1: What tools can I use if I don't have enough VRAM?

A: If your GPU lacks sufficient VRAM, you have two excellent cloud-based alternatives. Use RunPod for LoRA and checkpoint training; it provides powerful GPU instances on demand. For running ComfyUI workflows in the cloud, use Google Colab instead. You can get the Jupyter notebook for Google Colab on my GitHub. Both options let you train models and generate images without investing in expensive hardware.

Q2: Is it possible to train SD3 and Flux LoRA via Kohya?

A: Yes, it is possible, but training SD3 and Flux LoRA requires a significant amount of VRAM. You'll need a GPU with high VRAM (e.g., on RunPod) to handle the training.

Furthermore, training Flux | Wan | Z-Image Turbo with Kohya_ss is inefficient when better tools like AI-Toolkit exist.
