🧠 1) Planning your LoRA and its database
Define your goal clearly: is it a character LoRA, a style LoRA, or an object/scene LoRA?
Example goals: “anime character”, “painterly oil style”, “cyberpunk lighting”, etc.
For characters/faces, you can often get good results with 20–50 images.
For art styles or objects, you may need 50–200+ images depending on variety.
Legal and ethical warning:
Never train on private individuals or copyrighted artwork without permission.
Civitai and other platforms ban unlicensed or NSFW deepfakes.
Image quality: use sharp, clean, well-lit, and consistent images. Avoid blurry or watermarked ones.
🗂️ 2) Folder structure for your database
This is the standard layout used by Kohya, RunPod, Civitai, etc.:
my_lora_dataset/
├─ images/
│  ├─ 0001.jpg
│  ├─ 0002.png
│  └─ ...
└─ captions/
   ├─ 0001.txt
   ├─ 0002.txt
   └─ captions.csv (optional alternative)
✅ Best practices:
Resolution:
SD1.5 → 512×512
SDXL → 1024×1024
Aspect ratio: keep similar ratios for better results.
Number of images: 20–100 for characters, up to 200+ for styles.
Captions: each image needs a caption describing what it contains.
Example caption (0001.txt):
mia_token smiling, 3/4 portrait, soft sunlight, wearing red jacket, cinematic bokeh background
Use a unique activation token like mia_token to identify your character in captions.
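Mismatched image/caption pairs are a common source of silent training problems. Assuming the images/ + captions/ layout above, a small script can flag any image without a caption (and vice versa) before you start:

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}

def check_dataset(root: str) -> list[str]:
    """Return a list of problems: images without captions, and orphan captions."""
    images = {p.stem for p in (Path(root) / "images").iterdir()
              if p.suffix.lower() in IMAGE_EXTS}
    captions = {p.stem for p in (Path(root) / "captions").glob("*.txt")}
    problems = [f"missing caption: {s}" for s in sorted(images - captions)]
    problems += [f"orphan caption: {s}" for s in sorted(captions - images)]
    return problems

# Usage: print("\n".join(check_dataset("my_lora_dataset")) or "dataset OK")
```

Run it once after captioning; an empty result means every image has exactly one matching .txt file.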
🧹 3) Preparing your images
Resize all images to your target resolution (512 or 1024).
Optional: remove backgrounds for character-focused LoRAs.
Clean duplicates, blurry, or low-quality images.
Create captions manually or semi-automatically.
SD1.5 LoRAs often use short “tag” captions.
SDXL/Flux LoRAs work better with natural language captions (short sentences).
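The resize step can be automated. A minimal sketch using Pillow (function and path names are illustrative, not part of any trainer): it scales each image so its longer edge matches the target resolution, preserving aspect ratio.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}

def fit_size(w: int, h: int, target: int) -> tuple[int, int]:
    """Scale the longer edge to `target`, preserving aspect ratio."""
    scale = target / max(w, h)
    return round(w * scale), round(h * scale)

def resize_folder(src: str, dst: str, target: int = 512) -> None:
    """Resize every image in `src` and save as JPEG into `dst`."""
    from PIL import Image  # imported lazily; requires `pip install Pillow`
    out = Path(dst)
    out.mkdir(parents=True, exist_ok=True)
    for p in sorted(Path(src).iterdir()):
        if p.suffix.lower() not in IMAGE_EXTS:
            continue
        img = Image.open(p).convert("RGB")
        img = img.resize(fit_size(*img.size, target), Image.LANCZOS)
        img.save(out / (p.stem + ".jpg"), quality=95)
```

Use target=512 for SD1.5 and target=1024 for SDXL, matching the resolutions from section 2. Trainers with aspect-ratio bucketing accept non-square results; otherwise crop or pad to a square afterwards.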
💻 4) Training locally (if you have a GPU)
You can train LoRAs locally with Kohya_ss, Linaqruf’s trainer, or Diffusers scripts.
Requirements:
Python 3.10/3.11
NVIDIA GPU with CUDA support (12 GB+ VRAM recommended for SD1.5, 24 GB+ for SDXL)
Installed:
torch, xformers, accelerate, transformers, etc.
Example workflow:
git clone https://github.com/bmaltais/kohya_ss.git
cd kohya_ss
# create venv, install dependencies following repo instructions
Then, organize your dataset (images/ + .txt captions) and run a training command like this:
python train_network.py \
--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5 \
--train_data_dir=/path/to/my_lora_dataset/images \
--resolution=512 \
--output_dir=/path/to/output/lora_out \
--network_module=networks.lora \
--save_model_as=safetensors \
--network_dim=64 \
--network_alpha=64 \
--max_train_steps=1000 \
--train_batch_size=2 \
--mixed_precision=fp16 \
--save_every_n_epochs=1 \
--caption_extension=.txt
💡 Tips:
Use .safetensors for safety and Civitai compatibility.
Adjust max_train_steps, learning_rate, and batch_size according to your VRAM.
Use --save_state / --resume to continue interrupted runs.
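When choosing max_train_steps, a common back-of-envelope rule for Kohya-style trainers is: each epoch shows every image a number of times set by the folder's repeat count, and steps are those views divided into batches. A quick sketch (the formula is the convention, the function name is illustrative):

```python
def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int) -> int:
    """Estimate total optimizer steps: each epoch sees every image
    `repeats` times, grouped into batches of `batch_size`."""
    return (num_images * repeats * epochs) // batch_size

# 40 images x 10 repeats x 5 epochs, batch size 2 -> 1000 steps,
# which matches the max_train_steps=1000 in the example command above.
print(total_steps(40, 10, 5, 2))
```

For character LoRAs, people commonly aim for roughly 1000–3000 total steps; styles often need more. Treat these as starting points to tune, not fixed rules.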
☁️ 5) If you don’t have a powerful computer
You have several great cloud options:
🔹 Google Colab
Many public notebooks exist (search “Kohya LoRA Colab”).
Free tier = limited GPU/time; Colab Pro gives faster GPUs.
Steps:
Open notebook → mount Google Drive
Upload your dataset zip
Adjust parameters (model, steps, etc.)
Run cells → download the .safetensors result.
🔹 RunPod / Lambda / Paperspace / Vast.ai
Rent a GPU by the hour (starting from ~$0.20/hour).
Run pre-built Kohya templates.
You can upload your dataset and train as if it were local.
🔹 Civitai Training (Flux / Kohya on-site)
Civitai recently added in-platform training for LoRAs.
You upload your dataset ZIP and choose the training engine: Kohya or Flux.
Configure:
Base model
Resolution
Steps / Learning Rate
Checkpoints / Epochs
Captions type
Then Civitai handles the GPU and process automatically.
You can monitor logs and preview samples in real time.
🧩 6) Testing, adjusting, and uploading your LoRA
Test the .safetensors file in your diffusion UI (Auto1111, SD.Next, etc.).
Adjust the LoRA weight (start around 0.7–1.0).
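The UI's LoRA weight slider simply scales the low-rank update before it is added to the base weights, alongside the network_alpha/network_dim ratio from training. A minimal NumPy sketch of the math (shapes and names are illustrative):

```python
import numpy as np

def merge_lora(W, A, B, alpha, scale=0.8):
    """Apply a LoRA delta to a base weight matrix W.
    A is the rank-r down-projection, B the up-projection;
    `scale` is the UI slider, alpha/rank is baked in at training time."""
    rank = A.shape[0]
    return W + scale * (alpha / rank) * (B @ A)

W = np.zeros((4, 4))     # stand-in for a base attention weight
A = np.ones((2, 4))      # rank-2 down-projection
B = np.ones((4, 2))      # up-projection
merged = merge_lora(W, A, B, alpha=2.0, scale=1.0)
# each entry: 1.0 * (2/2) * (sum over rank 2) = 2.0
```

This is why setting network_alpha equal to network_dim (as in the training command above, 64/64) gives a factor of 1, and why lowering the slider weakens the LoRA's effect linearly.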
If it overfits (copies dataset images):
Reduce steps
Add regularization images
Increase dataset diversity
Export final model and upload to Civitai:
Add name, tags, model version, description, trigger word, and sample renders.
⚙️ 7) Common issues and practical tips
Issue → Possible fix
Loss = NaN → lower the learning rate, use a smaller batch size
Overfitting → reduce training steps, add regularization
Too generic → add more specific captions, reduce variety
Output too random → add more images or train longer
Dataset inconsistent → keep the same resolution and visual theme
Other good habits:
Keep your dataset balanced and clean.
Avoid mixing art styles unless that’s your goal.
For faces, vary angles, lighting, and expressions.
✅ 8) Quick checklist before training
[ ] Clear goal (character / style / object).
[ ] Dataset organized (images + captions).
[ ] Images resized to 512 px or 1024 px.
[ ] Captions consistent, one unique trigger token.
[ ] Environment ready: local GPU, Colab, RunPod, or Civitai.
[ ] Training parameters planned: steps, lr, batch, network_dim.
[ ] Tool to test (Auto1111, ComfyUI, SDXL UI, etc.).
📚 Extra reading / tutorials
Civitai “Train a Flux LoRA” docs (check their Education or Train page)
RunPod “Civitai + RunPod LoRA training” examples
Community guides: “How to Train a LoRA (SD1.5 & SDXL)”, “Best captioning practices for Flux”.

