Z-Image 4.0 Workflow Guide
First of all, a quick thank you to the people behind ComfyUI, CivitAI, and all of the creators making nodes, checkpoints, and LoRAs, then sharing them for free. Without that generosity, this would never have become a hobby I could properly pursue, and I definitely would not be making images like this. Because of that, I’ve always tried to follow the same ethos and share my workflows and prompts whenever I can. 🙌
Also, a small disclaimer before we begin: I’m currently off work and on some fairly strong pain medication, so if anything in this post is a little wonky, that’s the reason. I’ve wanted to get this workflow out for a while, and now I finally have the time, so I’ve had ChatGPT help me pull this post together. Say hello, GPT 👋
This workflow is not finished. I’ll keep adding to it, tidying it up, and simplifying things where I can. I’m not a programmer, I currently work two jobs, and I’m applying for a third, so please be patient with me. If something doesn’t work, feel free to ask and I’ll do my best to help.
What this workflow does
At its core, this workflow combines Z-Image Base and Z-Image Turbo. Base handles the initial composition, and Turbo is used to refine the result.
This workflow includes:
Text-to-image generation
Image-to-image
Built-in ControlNets
Auto-prompting from images
Prompt enhancement
Optional post-processing
Optional upscaling with SeedVR2 and Ultimate SD Upscale
CivitAI-friendly metadata saving
Saved image data that includes the checkpoints and LoRAs used in the generation
Important first step: custom node required
To get this workflow running properly, you’ll need to install a custom node that I created. It is included in the .zip download.
The node is called mp_aspect_res_selector and it needs to be placed inside:
ComfyUI\custom_nodes
I am absolutely not a programmer. This node was made because I wanted a simple way to choose an aspect ratio, orientation, and megapixel target, then have the workflow automatically calculate sensible dimensions that play nicely with image generation. It was very much vibe-coded with ChatGPT for a practical purpose, and it’s been genuinely useful for me.
The basic idea is simple: you choose an aspect ratio, choose whether you want portrait or landscape, choose a megapixel target, and the node works out the final pixel dimensions for you. That means less messing about and a much quicker way to try different compositions.
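For anyone curious about what the node is doing under the hood, the maths is roughly along these lines. This is a simplified sketch rather than the node's actual code, and the snapping to multiples of 64 is just a common convention for latent models, not necessarily what my node uses:

```python
import math

def aspect_res(aspect_w, aspect_h, orientation, megapixels, multiple=64):
    # Put the longer side where the chosen orientation wants it
    if orientation == "portrait":
        aspect_w, aspect_h = min(aspect_w, aspect_h), max(aspect_w, aspect_h)
    else:  # landscape
        aspect_w, aspect_h = max(aspect_w, aspect_h), min(aspect_w, aspect_h)

    target_pixels = megapixels * 1_000_000
    ratio = aspect_w / aspect_h

    # Solve width * height = target_pixels with width / height = ratio
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    # Snap both sides to a multiple (an assumption here, not necessarily
    # what the real node does) so the latent sizes stay clean
    width = max(multiple, round(width / multiple) * multiple)
    height = max(multiple, round(height / multiple) * multiple)
    return int(width), int(height)

print(aspect_res(16, 9, "landscape", 2.0))  # -> (1856, 1088), roughly 2 MP
```

So a 2 MP target at a 16:9 landscape ratio lands you somewhere around 1856 × 1088 without having to work the numbers out by hand.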
Main workflow overview
The workflow is split into a few main sections, so once you know what each area does, it becomes much easier to use.

1. Global subgraph

Most of the important variables can be accessed from the main workflow.
This is the heart of the workflow: it is where you load the Z-Image Base checkpoint and the Z-Image Turbo checkpoint that you want to use.
The settings in here are currently the ones that have worked best for me, but they are absolutely open to experimentation. If you like to tinker, this is one of the main places to do it.
In simple terms:
ZiB / Base does the initial composition and structure
ZiT / Turbo handles the refinement pass
The custom megapixel/aspect selector controls your output dimensions in a cleaner way
The workflow is set up to give you a strong base image, then push it further with refinement
By default, I usually work at around 2 megapixels, and for normal generation that is often more than enough.
Under the Global section are the LG Noise Injection and Post Processing sections. These are already set to values that have worked well for me, but please do experiment. Sometimes disabling something completely is just as useful as tweaking it.
2. Prompt and negative prompt
For standard use, just type your positive prompt into the green prompt node and your negative prompt into the red negative prompt node, then run the workflow. That is the quickest and simplest way to get started.
LoRAs
I’ve included two Power LoRA Loaders in the main workflow:
One for Z-Image Base
One for Z-Image Turbo
These sit above the main prompt area. Add your LoRAs there exactly as you normally would, and set your strengths to taste.
If you are someone who likes mixing style LoRAs and character LoRAs between stages, this makes that much easier.
I’ve also set things up so that the LoRAs used in the workflow are carried through into the saved image metadata, alongside the checkpoint information, which makes it much easier to keep track of how a final image was built.
Wildcard / Prompt Enhancement subgraph

When enabled, this section takes whatever you have written in the standard green positive prompt, then combines it with any extra text and/or wildcards entered into the wildcard section, and sends the result through QwenVL-Mod Prompt Enhancer.
So the flow here is basically:
Write your normal prompt
Add optional wildcard text
Optionally randomise the wildcard content
Feed the combined prompt into the prompt enhancer
Inside this section, you can choose an enhancement method from the dropdown menu. I’ve also included two custom JSON prompt templates that you can paste into the text box under the enhancement style setting if you want to build your own enhancement format.
I’ve found this especially useful when pushing prompts into a more structured JSON-style format, and it has worked particularly well for my NSFW prompting.
If you like your prompts a bit more controlled and machine-readable, this part can be very handy. If you prefer to keep things simple, you can ignore it completely and just use your normal prompt as-is.
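If you are wondering what the wildcard part is doing conceptually, it is just random substitution: a token in your prompt gets swapped for a random entry from a list before the result goes on to the enhancer. Here is a toy Python illustration of the idea. It is not the actual node code, and the __name__ token syntax is only for this example:

```python
import random
import re

def expand_wildcards(prompt, wildcards, seed=None):
    # Replace each __name__ token with a random entry from that wildcard list
    rng = random.Random(seed)

    def pick(match):
        name = match.group(1)
        options = wildcards.get(name, [match.group(0)])  # leave unknown tokens alone
        return rng.choice(options)

    return re.sub(r"__(\w+)__", pick, prompt)

base_prompt = "portrait photo of a woman, __lighting__, __mood__"
wildcards = {
    "lighting": ["soft window light", "golden hour sun", "hard studio flash"],
    "mood": ["melancholic", "playful", "serene"],
}
print(expand_wildcards(base_prompt, wildcards, seed=42))
```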
Auto Prompt from Image

This section lets you load an image and have QwenVL-Mod generate a descriptive prompt from it.
That makes it useful for:
Reverse-engineering an existing image
Building a starting prompt from reference art
Converting illustrated or anime images into a more descriptive prompt structure
A very useful workflow here is to let the model describe the image, then copy that result and refine it manually before using it as your main generation prompt.
Image-to-Image subgraph

This section allows you to load an image and use it for img2img.
Important:
When using img2img, go to the Latent Switch node underneath and manually select input 1
For img2img runs, I also recommend reducing the denoise in the Global subgraph to around 0.8 as a starting point

From there, adjust the denoise value to suit the result you want. Lower values will preserve more of the source image. Higher values will allow the model to reinterpret it more aggressively.
One thing this works very nicely for is turning anime images into realistic images. That is one of the main reasons I built these extra tools into the workflow.
My anime-to-realistic method
Load the anime image into the Image 2 Image subgraph
Enable the Auto Prompt subgraph and load the same image there as well
Run one image to get a descriptive prompt
Copy that result into the main green prompt box
Disable the Auto Prompt section
Add this line to the top of the prompt:
Create an incredibly lifelike cinematic style realistic image that is indistinguishable from reality:
Then remove any obvious references to anime, animation, cel shading, and similar wording
Run again and refine with denoise and prompt changes
From there it becomes a bit of a balancing act between denoise, prompt wording, and whether you want to enable prompt enhancement.
And if that still isn’t enough...
Enable the ControlNet subgraph.
ControlNet subgraph

Once the subgraph is enabled, go into it and turn on the groups you want to use.
For anime-to-realistic conversions, I would usually start with:
Loader — this needs to be active every time
Depth Anything — usually my first choice
The built-in options include:
Depth Anything V3
Depth Anything V2
Canny
OpenPose
A good simple test is to use the same source image in img2img and ControlNet together, then let Depth Anything help hold the structure while Z-Image pushes it into realism.
Post Processing subgraph

This section comes enabled with settings that I’ve found work well for me:
A little sharpen
A little film grain
They are there to add a bit of bite and texture to the final image, but definitely try turning them off or adjusting them to suit your own taste. Some images benefit from them more than others.
Upscaling options
SeedVR2 Upscale

This is the first optional upscale path. It includes its own post-processing as well.
I do not use it on every image, because I’m usually happy generating at 2MP, but it is there if you want to push detail further.
If you enable it, just be aware that it is doing real work and can take a while.
Ultimate SD Upscale

This is the second upscale option. Again, optional. Again, worth experimenting with.
And yes, if you enable both, the output from SeedVR2 will feed into Ultimate SD Upscale. This can absolutely give you huge results... but it can also mean leaving your PC running overnight for a single image 😅
Image saving and metadata

One thing I nearly forgot to mention, and it is actually quite an important part of the workflow, is the way the Image Saver has been set up.
I have configured it so that the saved image carries useful generation data with it, rather than just dumping out a plain image file. That means the workflow is designed to preserve important metadata for later reference, posting, and organisation.
Most importantly, I’ve made it so that the saved data includes:
The checkpoints used
The LoRAs used
Prompt-related information
Other generation settings such as dimensions and sampler data
That makes the workflow much more useful if you like to go back through older generations and remember exactly how an image was made. It also makes life easier when posting to places like CivitAI, because the generation data is much more complete and useful.
So if you are someone who likes keeping track of what model mix, LoRAs, and settings created a particular image, this should help a lot. 🙂 It might sound like a small thing, but having that data carried in the file makes revisiting old generations, troubleshooting results, and posting much easier.
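If you ever want to double-check what a saved file is actually carrying, you can peek at the PNG text chunks with a few lines of Python. The filename and the chunk keys you will see depend on how the Image Saver is configured, so treat the snippet below as an example rather than a guarantee:

```python
from PIL import Image  # pip install pillow

# Peek at the metadata stored in a generated PNG.
# The filename is a placeholder; the exact chunk keys depend on
# how the Image Saver node is set up.
img = Image.open("ComfyUI_00001_.png")
for key, value in img.text.items():  # PNG text chunks (prompt, workflow, etc.)
    print(f"--- {key} ---")
    print(value[:500])  # many chunks are long, so only show the start
```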
Recommended starting workflow
Install the custom node
Load your ZiB and ZiT checkpoints
Add any LoRAs you want
Set your aspect ratio, orientation, and megapixels
Write your main prompt and negative prompt
Run a basic text-to-image test first
Then start enabling extras one at a time:
Wildcard / Prompt Enhancement
Auto Prompt
Img2Img
ControlNets
Upscaling
That way, if something breaks, you’ll have a much better idea of which section caused it.
A few final notes
There are a lot of moving parts in this workflow. I’m very aware of that.
I may not always be technical enough to solve every issue, but if you do find an obvious bug or error, please let me know and I will try to fix it when I can.
More improvements and simplifications will come over time. This is very much a living workflow rather than a finished polished product.
Most importantly: please enjoy it. I hope it helps you make cool things.
If you use it...
If you make something with this workflow, please consider submitting your creations through the workflow page using the Add Post button.

That helps show other people that this resource is being used, and it lets me see what you’ve made, which genuinely motivates me to keep improving and sharing more workflows.
Thanks for checking it out, and happy generating. ✨

