First of all, this is my first article, so be gentle in the comments...
I started with SDXL and Flux.1 Dev nearly two years ago, and almost immediately got frustrated by the results. Images were, let's say, far from my expectations.
I quickly dropped SDXL and concentrated my efforts on Flux. Images were a lot better quality, so I downloaded a bunch of LoRAs, and refined my workflows to the point where I could say I was pretty satisfied with the results. But... there was always a "but"...
What infuriated me most was distorted and deformed hands and feet. Six fingers and toes were the norm at that time, sometimes even more... It took several attempts to get a picture right. A couple of LoRAs claimed to fix the issue, but none of them worked perfectly. Second thing: the Flux chin. I can recognize a Flux image from a mile away just by looking at the chins.
Then I discovered WAN 2.1. This model is really good! Some of my best pictures were created with it. WAN 2.2 was even better by some accounts, but for me it was too complicated to use for pictures only. Low noise, high noise, two model versions for one thing. I tried it, of course, but the idea behind those "noises" was a bit too much for me.
OK, enough history, let's fast forward to the present.
Z-Image Turbo
So here we are. Z-Image came along, and it was a game changer for me. Not only does it not produce the Flux chin, it's way faster, way more consistent, and way, way better with hands and feet. Of course, it sometimes gives bad results, but far less often than other models.
To give you an idea: I can now generate an image (832✖1248) and upscale it to 1344✖2016 in less than 40s on my NVIDIA 5060 Ti (after the first run). It's damn fast.
But it was not always sunshine and roses. I had to put in some effort to speed it up to that point. I started with a high-resolution image, thinking it would render a good result. Nope. Then I searched the internet for articles and tutorials. It took me a while, but I think I am starting to get a good speed-to-quality balance.
I lowered the initial resolution to 832✖1248 (divisible by 16) and then upscaled to the desired resolution. That resulted in a significant speed increase, but (as always) that was not all. So I went back to searching.
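If you want to try other base resolutions, the only hard rule is that width and height should be divisible by 16. Here is a minimal sketch of that idea; the function names are my own, not from any workflow or node:

```python
# Hypothetical helpers (not part of ComfyUI or Z-Image): snap a base
# resolution to multiples of 16, then compute the upscaled target size.

def snap_to_16(width: int, height: int) -> tuple[int, int]:
    """Round each dimension to the nearest multiple of 16 (minimum 16)."""
    snap = lambda v: max(16, round(v / 16) * 16)
    return snap(width), snap(height)

def upscale(width: int, height: int, factor: float) -> tuple[int, int]:
    """Scale the base resolution by a factor for the upscaler stage."""
    return round(width * factor), round(height * factor)

base = snap_to_16(832, 1248)    # already divisible by 16 -> (832, 1248)
final = upscale(*base, 1.615)   # roughly the 1344x2016 target from the article
```

The point is simply that you generate small (and 16-aligned) for speed, then let the upscaler do the heavy lifting to the final resolution.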
Then I incorporated the CacheDiT Accelerator into my workflow. It almost cut the render time in half (upscale speed is still the same as before). The funny thing about it is that the more images you render, the faster you get them. That's the power of caching.
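CacheDiT's internals are more sophisticated than this (it caches intermediate diffusion-transformer computations between steps), but the underlying principle is ordinary memoization: work you have already done is reused instead of recomputed, so later runs get cheaper. A toy illustration, with names of my own invention:

```python
# Illustration of the caching principle only, NOT how CacheDiT actually works.
from functools import lru_cache

calls = 0  # counts how often the "expensive" work is really performed

@lru_cache(maxsize=None)
def expensive_step(block_id: int) -> int:
    """Stand-in for a heavy transformer block computation."""
    global calls
    calls += 1
    return block_id * block_id

# First "render": every block is computed from scratch.
for b in range(4):
    expensive_step(b)

# Second "render": every result comes straight from the cache.
for b in range(4):
    expensive_step(b)

print(calls)  # the work was done 4 times, not 8
```

That is why the speedup grows as you render more images: the cache keeps getting warmer.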
Z-Image Base
Now about Z-Image Base. Of course I tried it, why wouldn't I?
It's really good, better in quality than its little brother. But it's also terribly slow (at least on my machine). With the same settings I use for the Turbo version, it renders an image in about 3 minutes! I suppose that's because of my limited VRAM (just 16GB). Let me know in the comments how it works for you.
So, for the time being I am going to stick to Z-Image Turbo.
My workflow
For those interested, I have just published my current Z-Image workflow. It is a continuous work in progress, so expect more versions. You can download it here, or from this link: Z Image Turbo AIO powered by Ollama
All the instructions and links are on the info page. I have recently dropped the multi-page layout in favor of Subgraphs to clean up the obvious mess and confusion. Now there is only one page with all the essentials to get you started. You just choose your models, LoRAs, resolution and prompts and press the Run button.
The workflow lets you choose between Z-Image Turbo AIO models. The reason I made it for AIO versions only is that it is cleaner, and ComfyUI loads the Model, Text Encoders and VAE separately anyway. Feel free to download and use models of your choice if you prefer.
You can use it without any restrictions for whatever purpose, but it would be nice to mention me as the creator...
Thank you, all my 600+ great followers!
Happy rendering!
Zoltar358


