UPDATED (31/08/24):
Welcome to my streamlined and updated guide on training LoRAs using AMD GPUs. This guide is specifically tailored for environments running ROCm 6.2 on the latest Ubuntu 24.04 "Noble Numbat" release, incorporating the latest PyTorch libraries.
WHAT'S NEW:
ROCm 6.2 Integration: Full support for the latest ROCm drivers, ensuring optimal performance on AMD GPUs.
Updated for Ubuntu 24.04: Tailored installation instructions for the newest Ubuntu "Noble Numbat" release.
Latest PyTorch Compatibility: Fully compatible with the most recent PyTorch version for seamless training experiences.
SETTING UP THE TRAINING ENVIRONMENT:
I won't lie, this can be a ball ache, but once done, it works and works well for the most part. Even on my older 6900 XT I can train a high-quality LoRA in around 40 minutes or so (8 epochs, 70 images, probably overkill for an anime model, using the CAME optimizer), realistic or otherwise.
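For a rough sense of scale, that run works out to the following step count, assuming the dataset settings given later in this guide (num_repeats = 3, batch_size = 2):

```python
# Back-of-the-envelope step count for the run described above, using the
# dataset settings from the TOML later in this guide (num_repeats = 3,
# batch_size = 2, gradient_accumulation_steps = 1).
import math

images, num_repeats, batch_size, epochs = 70, 3, 2, 8
steps_per_epoch = math.ceil(images * num_repeats / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 105 840
```

So roughly 840 optimizer steps in about 40 minutes on a 6900 XT.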
Installing Derrian's LoRA Easy Training Scripts (Dev Branch)
To set up Derrian's LoRA Easy Training Scripts, especially for AMD GPUs, follow these steps. While the stable branch might work with the latest updates, we’ll use the dev branch as it's a proven method. You can experiment with the recently updated stable branch if you'd like.
Step 1: Clone the Repository (Dev Branch)
First, clone the dev branch of Derrian's LoRA Easy Training Scripts by typing the following command in your terminal:
git clone -b dev https://github.com/derrian-distro/LoRA_Easy_Training_Scripts.git
This command downloads the necessary files from the repository onto your local machine.
Step 2: Navigate to the Project Directory
After cloning the repository, switch into the newly created directory:
cd LoRA_Easy_Training_Scripts
Step 3: Initialize and Update Git Submodules
Within the project directory, you need to initialize and update the submodules:
git submodule init
git submodule update
These commands ensure that all the necessary submodules, which are dependencies of the main project, are correctly set up.
Step 4: Manually Clone the Kohya_ss Scripts
For some reason, the Kohya scripts may not pull through, so they will need to be installed manually:
cd backend
git clone https://github.com/kohya-ss/sd-scripts.git
After cloning the Kohya_ss scripts, you'll need to perform a bit of cleanup:
Delete the empty sd_scripts directory from the backend folder that was initially created.
Rename the newly cloned sd-scripts folder to sd_scripts, matching the just-deleted directory name.
This step ensures that the project directory structure is correct and that the scripts can run without issues.
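The delete-and-rename cleanup can be pictured with a short sketch. This runs against a throwaway temporary directory standing in for the backend folder, so it is safe to execute anywhere:

```python
# Illustration of the Step 4 cleanup: remove the empty sd_scripts placeholder
# and rename the freshly cloned sd-scripts to take its place. Done in a
# temporary directory standing in for LoRA_Easy_Training_Scripts/backend.
import shutil, tempfile
from pathlib import Path

backend = Path(tempfile.mkdtemp())               # stands in for .../backend
(backend / "sd_scripts").mkdir()                 # empty placeholder from the clone
(backend / "sd-scripts").mkdir()                 # manually cloned Kohya repo
(backend / "sd-scripts" / "train_network.py").touch()

(backend / "sd_scripts").rmdir()                 # delete the empty directory
(backend / "sd-scripts").rename(backend / "sd_scripts")  # rename the clone

remaining = sorted(p.name for p in backend.iterdir())
print(remaining)  # ['sd_scripts']
shutil.rmtree(backend)                           # tidy up the demo directory
```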
Setting Up Python 3.10.14 on Ubuntu 24.04 with Pyenv
Ubuntu 24.04 "Noble Numbat" comes with Python 3.12.3 by default. However, for running Derrian's LoRA Easy Training Scripts, you need to use Python 3.10.x. As of August 2024, the most recent version of Python 3.10 is Python 3.10.14, released on March 19, 2024. This version primarily focuses on security updates and is in the "security fixes only" phase.
To set up Python 3.10.14 on your system using pyenv, follow these steps:
Step 1: Install Pyenv
Before installing pyenv, you need to install several dependencies that are required for building Python versions:
sudo apt update
sudo apt install -y \
make build-essential libssl-dev zlib1g-dev \
libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm \
libncurses5-dev libncursesw5-dev xz-utils tk-dev \
libffi-dev liblzma-dev python-openssl git
pyenv is now ready to be installed on your system; install it by running the following command:
curl https://pyenv.run | bash
After installing pyenv, add the following lines to your shell profile (~/.bashrc, ~/.zshrc, etc.):
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init --path)"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
Reload your shell configuration:
source ~/.bashrc # or source ~/.zshrc
Step 2: Install Python 3.10.14
With pyenv installed, you can now install Python 3.10.14 to run the training scripts:
pyenv install 3.10.14
You may get errors about missing dependencies during the build, but you should be able to safely ignore them, as the affected modules are not required by the GUI or training scripts.
Step 3: Set the Local Directory to Use Python 3.10.14
Navigate to the LoRA_Easy_Training_Scripts project directory where you want to use Python 3.10.14, and set it as the local Python version:
cd /path/to/LoRA_Easy_Training_Scripts
pyenv local 3.10.14
Make sure that pyenv is properly configured in your shell and that the local version took effect. You can verify this by checking which Python version is now active:
python --version
If this returns Python 3.10.14, then pyenv is active and functioning as expected.
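Under the hood, pyenv local simply writes the chosen version into a .python-version file in the current directory, which pyenv's shims read to pick the interpreter. A minimal illustration, writing the file by hand (in a temp directory) exactly as pyenv would:

```python
# `pyenv local 3.10.14` records the version in a .python-version file; pyenv's
# shims read this file to decide which interpreter to run in that directory.
# Written by hand here, in a temp dir, purely to illustrate the file's contents.
import tempfile
from pathlib import Path

project = Path(tempfile.mkdtemp())            # stands in for the project dir
version_file = project / ".python-version"
version_file.write_text("3.10.14\n")          # what `pyenv local 3.10.14` writes
pinned = version_file.read_text().strip()
print(pinned)  # 3.10.14
```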
Step 4: Create a Virtual Environment in the sd_scripts Directory
After setting the correct Python version, navigate to the sd_scripts folder within your project:
cd LoRA_Easy_Training_Scripts/backend/sd_scripts
Create a virtual environment inside the sd_scripts folder using Python 3.10:
python3.10 -m venv venv
This command creates a virtual environment in the sd_scripts directory.
Activate the virtual environment with the following command:
source venv/bin/activate
The environment must be created within the sd_scripts folder. Once created, you can activate the environment from any directory by adjusting the source path, which will be relevant in later steps. Following these steps ensures that your environment is correctly set up to run Derrian's LoRA Easy Training Scripts using Python 3.10.14 on Ubuntu 24.04 "Noble Numbat".
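If you want to see what the venv step actually produces, here is a quick sketch using a temporary directory and the running interpreter rather than the project venv:

```python
# Sketch of what `python3.10 -m venv venv` lays down: a private interpreter
# plus the bin/activate script you source in the next step. Demonstrated in a
# temporary directory with whatever interpreter is currently running.
import tempfile, venv
from pathlib import Path

target = Path(tempfile.mkdtemp()) / "venv"
venv.create(target, with_pip=False)           # with_pip=False keeps the demo fast
activate = target / "bin" / "activate"
print(activate.exists())  # True on Linux/macOS
```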
Setting Up ROCm and PyTorch for LoRA Easy Training Scripts
To ensure optimal performance and compatibility with LoRA Easy Training Scripts, follow these steps. This guide assumes that you're using ROCm drivers version 6.1 or higher. ROCm 6.2 is recommended, though ROCm 6.1 is sufficient, since it includes the functionality that bitsandbytes would otherwise provide.
Step 1: Ensure ROCm Drivers Are Up-to-Date
Make sure your ROCm drivers are updated to at least version 6.1. ROCm 6.2 is preferred as it includes additional enhancements, but for the context of this tutorial and training process, version 6.1 is the minimum requirement.
ROCm 6.1 and 6.2: Both versions support the latest stable PyTorch and include the functionality otherwise provided by bitsandbytes.
Step 2: Activate the Virtual Environment
Navigate to the sd_scripts directory where your virtual environment is located, and activate it. Activating from this directory ensures that the pip requirements file is correctly located.
cd LoRA_Easy_Training_Scripts/backend/sd_scripts
source venv/bin/activate
Step 3: Install the Latest Stable PyTorch with ROCm Support
With the virtual environment activated, install the latest stable version of PyTorch that supports ROCm 6.1:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.1
This command installs PyTorch, torchvision, and torchaudio with support for ROCm 6.1 or higher.
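To confirm the ROCm wheel was actually picked up, run a quick check inside the activated venv. On ROCm builds torch.version.hip is populated, while CPU/CUDA builds leave it unset; the snippet below hedges with getattr and also handles torch being absent:

```python
# Quick check (run inside the activated venv) that the ROCm build of PyTorch
# was installed: torch.version.hip is set on ROCm wheels, None/absent otherwise.
import importlib.util

def rocm_torch_status() -> str:
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    hip = getattr(torch.version, "hip", None)
    return f"ROCm build, HIP {hip}" if hip else "not a ROCm build"

print(rocm_torch_status())
```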
Step 4: Install Required Dependencies for Kohya_ss Scripts
Next, install the necessary dependencies for the Kohya_ss scripts by running:
pip install -r requirements.txt
This installs all the required packages listed in the requirements.txt file, ensuring the scripts can run correctly.
Step 5: Uninstall Unnecessary Packages
To avoid conflicts and unnecessary packages, uninstall xformers (if found) and bitsandbytes, as these are not required when using ROCm 6.1 or higher:
pip uninstall xformers bitsandbytes
This step ensures that only the necessary packages are installed, which is crucial for avoiding compatibility issues. If bitsandbytes is installed within the venv, the scripts will not work.
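Since a leftover bitsandbytes silently breaks the scripts, it is worth verifying the uninstall took. A small check, run inside the activated venv:

```python
# Verify the uninstall took: neither package should be importable from the
# venv. Run this inside the activated environment.
import importlib.util

leftovers = [name for name in ("xformers", "bitsandbytes")
             if importlib.util.find_spec(name) is not None]
print("clean" if not leftovers else f"still installed: {leftovers}")
```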
Step 6: Install Backend Requirements
Finally, install the backend requirements. This command must be run from within the sd_scripts directory where the virtual environment has been activated, as it includes a relative path (..) to access the correct requirements.txt:
pip install -r ../requirements.txt
This command installs additional dependencies needed for the backend scripts, ensuring that all components of your training setup are properly configured.
Finalizing the Setup with CAME Optimizer
This section will walk you through the final steps of setting up the LoRA Easy Training Scripts, including the installation of the CAME optimizer and schedulers, adjusting the environment activation script, and making necessary edits to the optimizer's initialization function.
Step 1: Install the CAME Optimizer and Schedulers
To install the CAME optimizer and schedulers, navigate to the sd_scripts directory where your virtual environment is activated, and run the following command:
pip install ../custom_scheduler/.
This command installs the CAME optimizer and related schedulers from the custom_scheduler directory.
Step 2: Install Frontend GUI Requirements
Next, navigate to the main LoRA_Easy_Training_Scripts directory and install the necessary requirements for the frontend GUI:
Navigate to the main directory:
cd LoRA_Easy_Training_Scripts
Install the frontend requirements:
pip install -r requirements.txt
Step 3: Adjust the run.sh Script and main.py
Before running the GUI, you need to make the run.sh script executable and update it to activate the correct virtual environment:
Make the script executable:
chmod +x run.sh
Edit run.sh: Open the run.sh file in a text editor, locate the line source venv/bin/activate, and replace it with source backend/sd_scripts/venv/bin/activate. This ensures that the correct virtual environment is activated when you run the script.
Edit main.py: Open the main.py file in any editor and adjust it to the following (just select all, then copy and paste), to ensure that the backend and the GUI will be able to communicate correctly:

import subprocess
import time
from pathlib import Path
import sys
import json
from PySide6 import QtWidgets
from qt_material import apply_stylesheet
import requests
from main_ui_files.MainWindow import MainWindow


def run_backend():
    command = "./backend/sd_scripts/venv/bin/python ./backend/main.py backend"
    print(f"Running command: {command}")
    # Run the command asynchronously
    process = None  # stays None if the backend fails to start
    try:
        process = subprocess.Popen(command, shell=True)
    except OSError as e:
        print(f"Failed to start the backend: {e}")
    # Wait for 5 seconds to ensure the backend starts
    time.sleep(5)
    return process  # Return the process object in case you need to interact with it later


def CreateConfig():
    return {
        "theme": {
            "location": Path("css/themes/dark_teal.xml").as_posix(),
            "is_light": False,
        }
    }


def main() -> None:
    # Start the backend asynchronously before initializing the GUI
    backend_process = run_backend()
    queue_store = Path("queue_store")
    if not queue_store.exists():
        queue_store.mkdir()
    config = Path("config.json")
    config_dict = json.loads(config.read_text()) if config.exists() else CreateConfig()
    if "theme" not in config_dict:
        config_dict.update(CreateConfig())
    config.write_text(json.dumps(config_dict, indent=2))
    app = QtWidgets.QApplication(sys.argv)
    if config_dict["theme"]["location"]:
        apply_stylesheet(
            app,
            theme=config_dict["theme"]["location"],
            invert_secondary=config_dict["theme"]["is_light"],
        )
    window = MainWindow(app)
    window.setWindowTitle("LoRA Trainer")
    window.show()
    app.exec()
    config_dict = json.loads(config.read_text())
    if not config_dict.get("run_local"):
        return
    if window.main_widget.training_thread:
        while window.main_widget.training_thread.is_alive():
            time.sleep(5.0)
        requests.get(f"{window.main_widget.backend_url_input.text()}/stop_server")
    # Optionally, terminate the backend process when the GUI is closed
    if backend_process is not None:
        backend_process.terminate()


if __name__ == "__main__":
    main()
Step 4: Edit the CAME Optimizer Initialization
If you plan to use the CAME optimizer, which is recommended, you'll need to modify its initialization function to properly initialize the step counting variable.
Navigate to the CAME optimizer file: The file you need to edit is likely located in the virtual environment's site-packages directory:
cd /path/to/your/LoRA_Easy_Training_Scripts/backend/sd_scripts/venv/lib/python3.10/site-packages/LoraEasyCustomOptimizer
Edit came.py: Open the came.py file and locate the __init__ function. Add the following line to initialize the _step_count attribute:

def __init__(
    self,
    params: PARAMETERS,
    lr: float = 2e-4,
    betas: BETAS = (0.9, 0.999, 0.9999),
    weight_decay: float = 0.0,
    weight_decouple: bool = True,
    fixed_decay: bool = False,
    clip_threshold: float = 1.0,
    ams_bound: bool = False,
    eps1: float = 1e-30,
    eps2: float = 1e-16,
):
    # Set the _step_count attribute during initialisation
    # by adding this line:
    self._step_count = 0

    self.validate_learning_rate(lr)
    self.validate_betas(betas)
    self.validate_non_negative(weight_decay, "weight_decay")
    self.validate_non_negative(eps1, "eps1")
    self.validate_non_negative(eps2, "eps2")

    self.clip_threshold = clip_threshold
    self.eps1 = eps1
    self.eps2 = eps2
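As a toy illustration of why this one-line edit matters (this is a hypothetical class, not the real CAME optimizer): code that reads the optimizer's _step_count before anything has set it hits an AttributeError, while initializing it in __init__ makes the first read safe.

```python
# Toy illustration (not the real CAME class) of the failure the edit prevents:
# reading a _step_count attribute that was never initialised raises
# AttributeError; setting it to 0 in __init__ makes the first read safe.
class ToyOptimizer:
    def __init__(self, patched: bool):
        if patched:
            self._step_count = 0  # the one-line fix added to came.py

def read_step_count(opt):
    try:
        return opt._step_count
    except AttributeError:
        return "AttributeError"

print(read_step_count(ToyOptimizer(patched=False)))  # AttributeError
print(read_step_count(ToyOptimizer(patched=True)))   # 0
```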
Final Steps
Once all the steps above are completed, your training GUI should work seamlessly with the CAME optimizer. If you encounter any issues with the TOML file, consider adjusting the dataset directory as a potential workaround. Keep an eye out for updates that may offer more concrete solutions to this issue. I am currently further exploring this issue.
___Still writing up using the GUI with the scripts, but it's pretty straightforward; if you got this far, I'm sure you will manage :)
Training Settings:
People like to say that when it comes to settings, everything is different, it's all trial and error, etc. I disagree. I will update this for realistic models, but for Pony/anime/style and concept models I have used exclusively the same settings with success. The VAE can be downloaded on CIVITAI if you don't already have it separately. These should work fine on any new AMD card with at least 12 GB of memory:
This section is still under development; however, most people coming here may be looking for training settings that go with the optimizer, so here are mine. Just copy the box, paste it into a text file named *.toml, load it in the GUI, and adjust the filenames accordingly (these settings are still current as of 31/08/2024):
[[subsets]]
caption_extension = ".txt"
image_dir = "/image_dir"
keep_tokens = 1
name = "dataset"
num_repeats = 3
shuffle_caption = true
[train_mode]
train_mode = "lora"
[general_args.args]
max_data_loader_n_workers = 1
persistent_data_loader_workers = true
pretrained_model_name_or_path = "/PonyDiffusionV6XL.safetensors"
sdxl = true
no_half_vae = true
mixed_precision = "fp16" #bf16 may be faster on newer cards, I see little difference in output quality however
gradient_checkpointing = true
gradient_accumulation_steps = 1
seed = 119
max_token_length = 225
prior_loss_weight = 1.0
sdpa = true
max_train_epochs = 8
cache_latents = true
vae = "/sdxlVAE.safetensors"
[general_args.dataset_args]
resolution = 1024
batch_size = 2
#dim and alpha can be put up to 16/8 for realistic or more complex models
[network_args.args]
network_dropout = 0.1
network_dim = 8
network_alpha = 4.0
min_timestep = 0
max_timestep = 1000
[optimizer_args.args]
lr_scheduler = "cosine"
optimizer_type = "Came"
lr_scheduler_type = "LoraEasyCustomOptimizer.CustomOptimizers.Rex"
loss_type = "l2"
learning_rate = 0.0001
warmup_ratio = 0.05
unet_lr = 0.0001
text_encoder_lr = 1e-6
max_grad_norm = 1.0
min_snr_gamma = 5 #change to 0 for realistic models
[saving_args.args]
save_precision = "fp16"
save_model_as = "safetensors"
save_every_n_epochs = 1
save_last_n_epochs = 3
output_dir = "/stable-diffusion-webui-forge/models/Lora/"
output_name = "lora_name_(no_extension)"
[noise_args.args]
noise_offset = 0.0357
multires_noise_iterations = 5
multires_noise_discount = 0.25
[bucket_args.dataset_args]
enable_bucket = true
min_bucket_reso = 320
max_bucket_reso = 2048
bucket_reso_steps = 64
[network_args.args.network_args]
[optimizer_args.args.lr_scheduler_args]
min_lr = 1e-6
[optimizer_args.args.optimizer_args]
weight_decay = "0.02"
GETTING STARTED (A WORK IN PROGRESS)
THE TRAINING SET
Will update this section; I'm currently using a script I wrote that automatically crops and grabs matching images from downloaded video files. The script needs a little more work, as the tagging, grabbing, and sorting are still touchy, but soon I should have a script that takes a group of videos as input, grabs and crops all the different characters, then sorts them by automatically selecting the most distinct images. It also tags the images to a set threshold and tag count in the same manner as the CIVITAI trainer, except it allows for simple tag editing and redundancy handling: for example, all tags related to a shirt can be adjusted to "red shirt", so "t-shirt", "shirt", "dress shirt", "red shirt", etc. will all be consolidated into a single tag. It will have options to match costumes as well. However, it is all script based, as I can't be bothered writing a GUI for it; if anyone wants to, please feel free. I will share it when it's done. I already use it for my LoRAs, which is how I created all my recent ones for characters that have little to no fan art on the various booru sites, but it still needs a little manual input and is currently far from idiot proof.




