
Z-Image Base Training Is (Mostly) Solved: Here's the Solution


This post is an expansion of my Reddit post HERE, and a copy of my Reddit post HERE.

Credit for some of this information goes to HERE.


Introduction:


We all know Z-Image has been rather troublesome to train: from convergence issues to compatibility issues, nothing has seemed to work. Fortunately, for everyone who didn't want to do a lot of testing, the community has come together and done it for you, and I myself have spent a couple of weeks figuring out what works. So I will share what has allowed me to train awesome Z-Image LoRAs that actually generate properly. I will keep this article brief (you can ask for details if you want) and stick to what YOU need to know and do to train on ZiB and produce quality LoRAs.


Part 1: Training

You can find my OneTrainer config HERE. This config MUST be used with THIS fork of OneTrainer.

One of the biggest hurdles with training Z-Image is convergence. This has been largely solved by setting Min_Snr_Gamma = 5. To my knowledge, this option does not exist in mainline OneTrainer, which is why the fork is required for now (though I'm sure it will be added to the main branch eventually).
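For the curious, Min-SNR-γ is a per-timestep loss weighting. Here is a minimal sketch of the classic ε-prediction form; how OneTrainer applies it to Z-Image's flow-matching objective may differ, so treat this as an illustration of the idea rather than the fork's exact implementation:

```python
def min_snr_gamma_weight(snr: float, gamma: float = 5.0) -> float:
    """Loss weight for one diffusion timestep under Min-SNR-gamma.

    Easy, high-SNR (low-noise) timesteps get their effective weight
    capped at gamma, so they no longer dominate the loss and training
    converges more stably across timesteps.
    """
    return min(snr, gamma) / snr

# At SNR = 20 the weight is capped to 5/20 = 0.25;
# at SNR <= 5 the weight stays 1.0.
```

The intuition: without the cap, the model spends most of its gradient budget on nearly-clean images, which is exactly the failure mode that shows up as "it never converges."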

The second necessary solution, which is much more commonly known, is to train with the Prodigy_adv optimizer with stochastic rounding turned on. ZiB seems to greatly dislike fp8 quantization and is generally sensitive to rounding error; stochastic rounding solves that problem.
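Stochastic rounding itself is easy to sketch. This toy Python version rounds to an arbitrary grid rather than to bf16 like the optimizer does, but the principle is identical: round up or down at random, with probabilities chosen so the expected result equals the true value.

```python
import math
import random

def stochastic_round(x: float, step: float) -> float:
    """Round x to a multiple of `step`, probabilistically.

    Rounds up with probability equal to x's fractional position between
    the two neighboring grid points, so the *expected* output equals x.
    Over many optimizer updates the quantization bias cancels out,
    unlike deterministic round-to-nearest, which silently drops tiny
    gradient contributions every step.
    """
    lo = math.floor(x / step) * step
    frac = (x - lo) / step
    return lo + step if random.random() < frac else lo
```

This is why it matters for ZiB: tiny weight updates that round-to-nearest would discard entirely still land some fraction of the time, so slow-moving parameters keep learning.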

These two additions make the BIGGEST difference. I also find that using random weighted dropout on your concept training prompts works best; I generally use 12 textual variations, but this should be increased for larger datasets.
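To illustrate what "random weighted dropout on your prompts" means in practice, here is a hypothetical sketch; the variant captions, weights, and drop probability below are made-up placeholders, not OneTrainer's internals or my actual prompts:

```python
import random

# Hypothetical caption variants for one concept (the post uses ~12).
VARIANTS = [
    "a photo of sks_style artwork",
    "an illustration in sks_style",
    "sks_style, detailed painting",
]

def sample_caption(variants, weights=None, drop_prob=0.1):
    """Pick one caption variant per training step.

    With probability drop_prob the caption is dropped entirely (empty
    string), which trains the unconditional branch and tends to improve
    CFG behavior at generation time. Otherwise a variant is drawn,
    optionally weighted so preferred phrasings appear more often.
    """
    if random.random() < drop_prob:
        return ""
    return random.choices(variants, weights=weights, k=1)[0]
```

The point of varying the caption per step is to keep the LoRA from over-fitting to one exact trigger phrase.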

All of these changes are already enabled in the OneTrainer config I provided; I just figured I'd outline the big ones. The config uses the settings I found best and most optimized for my 3090, but I'm sure it could easily be adapted for lower VRAM.

Notes:
1. If you don't know how to add a new preset to OneTrainer, just save my config as a .json file, and place it in the "training_presets" folder.

2. If you aren't sure whether you installed the right fork, check the optimizers: this fork has one called "automagic_sinkgd", which is unique to it. If you see that, you have the right fork. But don't use that optimizer; use prodigy_adv.


Part 2: Generation:

This is actually probably the BIGGER piece of the puzzle, bigger even than training.

For those of you who are not up to date: it is more-or-less established that ZiB was trained further after ZiT was released. Because of this, Z-Image Turbo is NOT compatible with Z-Image Base LoRAs. This is obviously annoying, since a distill is normally the best way to generate with models trained on a base. Fortunately, this problem can be circumvented.

There are many distills made directly from ZiB, and those ARE compatible with ZiB LoRAs. I've done most of my testing with RedCraft ZiB Distill, but in theory ANY distill will work (as long as it was distilled from the current ZiB), and now that we know this for sure, much better distills can be made. To be clear, this is NOT OPTIONAL: in my testing, ZiB LoRAs only work on distills, not on the base itself. But even given this "limitation", they work wonderfully.

This is a simple change, but a necessary one. ZiB LoRAs are frankly awful on the base model itself, but when used on a distill with the config provided, they simply work.

In terms of settings, I typically generate with a shift of 7 and a CFG of 1.5, but that is tuned for RedCraft. Euler with the "simple" scheduler seems to be the best sampler combination.
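If you're wondering what "shift" actually does: I'm assuming Z-Image follows the standard flow-matching timestep-shift convention popularized by SD3-style models, in which case it remaps the noise schedule like this (a sketch under that assumption, not a confirmed detail of Z-Image's samplers):

```python
def shift_sigma(sigma: float, shift: float = 7.0) -> float:
    """Time-shifted noise level for flow-matching samplers.

    shift > 1 bends the schedule so more of the sampling trajectory is
    spent at high noise levels (where global composition is decided).
    The endpoints sigma = 0 and sigma = 1 are left unchanged, and
    shift = 1 is the identity.
    """
    return shift * sigma / (1 + (shift - 1) * sigma)

# With shift = 7, the schedule midpoint sigma = 0.5 maps to 0.875,
# i.e. most steps happen at high noise.
```

Higher shift values tend to matter more at large resolutions, which fits with the 2048x2048 results below.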

I find that generating at 2048x2048 produces better results, even though I trained at 1024x1024. Not that 1024 is bad, but Z-Image handles larger resolutions really well.

Part 3: Limitations and Considerations:

The first limitation is that the distills the community has put out for ZiB are not yet as good as ZiT. They work wonderfully, don't get me wrong, but they have more potential than has been brought out so far. I see this as fundamentally a non-issue: now that we know distills are pretty much required, we can simply make good distills, or make good finetunes and then distill them. The only real problem is that people haven't been putting out distills in quantity.

The second limitation I know of is, mostly, a consequence of the first. While the character LoRAs I've tested work wonderfully, some things don't seem to train well at the moment, mostly texture: brush texture, grain, and so on. I have not yet gotten a model to learn advanced texture. However, I am 100% confident this is either a consequence of the distill I'm using not being optimized for it, or some minor tweak needed in my training settings. Either way, I have no reason to believe it's something that won't be worked out as distills and training improve.


Part 4: Results:

You can look at my profile to see all the style LoRAs I've posted thus far; I've also attached a couple of images from there as examples. Unfortunately, because I trained my character tests on random e-girls (they have large, easily accessible datasets), I can't really share those here. But rest assured they produced more-or-less identical likenesses as well. I haven't tested concepts, so I'd love it if someone ran that test for me!

