Z-Image Dual-Stage Workflow ZIB with ZIT Refiner

File: Archive (Other), 28.42 KB
Type: Workflows
Published: Apr 10, 2026
Base Model: ZImageBase
Hash (AutoV2): DF1D4078AD

Z-Image 4.6 Workflow Guide

 

First of all, a quick thank you again to the people behind ComfyUI, CivitAI, and all of the creators making nodes, checkpoints, LoRAs, workflows, tools, and helper scripts, then sharing them for everyone else to use. Without that community, this kind of workflow would not be possible, and it definitely would not be anywhere near as fun to build.

 

This is the updated 4.6 version of my Z-Image Dual-Stage workflow. It is still very much a living workflow. I will keep changing things, simplifying things, and trying to make the layout easier to understand over time, so please treat this as a practical guide rather than a polished technical manual.

 

The biggest change in 4.6 is the prompt system. The old Wildcard / Prompt Enhancement section has been removed, and the old Auto Prompt from Image section has also been removed. They have been replaced with a new Prompt / Prompt Builder subgraph, a new Prompt From Image loader, and a Gemma4-based Prompt Enhancer subgraph.

 

The short version is this:

 

- You can still use the workflow very simply by writing a normal prompt.

- You can open the Prompt / Prompt Builder subgraph and enable wildcard builder nodes if you want automated prompt variation.

- You can load an image into Prompt From Image if you want Gemma4 to enhance from an image.

- The Prompt Enhancer can improve either the basic text prompt or the image-guided prompt.

- The Image Saver now saves the seed used, as well as the other generation metadata.

 

What this workflow does

 

At its core, this workflow still combines Z-Image Base and Z-Image Turbo. Base handles the initial composition and structure, and Turbo is used to refine the result.

 

This workflow includes:

 

- Text-to-image generation

- Image-to-image

- Built-in ControlNets

- A new Prompt / Prompt Builder subgraph

- Sequential and random wildcard prompt tools

- Prompt enhancement using Gemma4

- Prompt enhancement from a loaded image

- Optional post-processing

- Optional upscaling with SeedVR2 and Ultimate SD Upscale

- CivitAI-friendly metadata saving

- Saved image data that includes checkpoints, LoRAs, prompt information, dimensions, sampler data, and now the seed used

 

Important first step: custom nodes required

 

There are more custom nodes in 4.6 than there were in the 4.0 workflow.

 

Some custom node packs still need to be installed through ComfyUI Manager or manually cloned into your ComfyUI/custom_nodes/ folder:

 

- rgthree-comfy: https://github.com/rgthree/rgthree-comfy

- ComfyUI-Image-Saver: https://github.com/alexopus/ComfyUI-Image-Saver

- ComfyUI-Impact-Pack: https://github.com/ltdrdata/ComfyUI-Impact-Pack

- ComfyUI-LG_SamplingUtils: https://github.com/LAOGOU-666/ComfyUI-LG_SamplingUtils

- ComfyUI_essentials: https://github.com/cubiq/ComfyUI_essentials

- ComfyUI-Easy-Use: https://github.com/yolain/ComfyUI-Easy-Use

- ComfyUI-KJNodes: https://github.com/kijai/ComfyUI-KJNodes

- RES4LYF: https://github.com/ClownsharkBatwing/RES4LYF

- ComfyUI ControlNet Aux: https://github.com/Fannovel16/comfyui_controlnet_aux

- ComfyUI-DepthAnythingV3: https://github.com/PozzettiAndrea/ComfyUI-DepthAnythingV3

- ComfyUI-SeedVR2_VideoUpscaler: https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler

- ComfyUI-Unload-Model: https://github.com/SeanScripts/ComfyUI-Unload-Model

- ComfyUI_UltimateSDUpscale: https://github.com/ssitu/ComfyUI_UltimateSDUpscale

- WAS Node Suite - Revised: https://github.com/ltdrdata/was-node-suite-comfyui

- comfyui-vrgamedevgirl: https://github.com/vrgamegirl19/comfyui-vrgamedevgirl

 

These custom nodes are included with the workflow package, so they should be inside the zip file and do not need a separate download:

 

- ComfyUI-SequentialWildcardPrompt

- ComfyUI-WildcardPromptAssembler

- ComfyUI-WildcardPromptToolkit

- mp_aspect_res_selector

 

Place the included custom node folders into:

 

ComfyUI/custom_nodes/

 

For a Windows portable install, that is usually:

 

ComfyUI_windows_portable/ComfyUI/custom_nodes/

 

After installing or copying the custom node folders, restart ComfyUI or refresh the node list.
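
If you want to confirm the folders are in place before launching ComfyUI, a small check like the one below can help. It is only a sketch: it assumes each pack's folder is named after its repository (or after the included folder names listed above), which may not hold if ComfyUI Manager installed a pack under a different name. The function name `missing_custom_nodes` is made up for this example.

```python
from pathlib import Path

# Assumed folder names, taken from the repository names and included packs
# listed above. Adjust these if your install uses different folder names.
EXPECTED_FOLDERS = [
    "rgthree-comfy",
    "ComfyUI-Image-Saver",
    "ComfyUI-Impact-Pack",
    "ComfyUI-LG_SamplingUtils",
    "ComfyUI_essentials",
    "ComfyUI-Easy-Use",
    "ComfyUI-KJNodes",
    "RES4LYF",
    "comfyui_controlnet_aux",
    "ComfyUI-DepthAnythingV3",
    "ComfyUI-SeedVR2_VideoUpscaler",
    "ComfyUI-Unload-Model",
    "ComfyUI_UltimateSDUpscale",
    "was-node-suite-comfyui",
    "comfyui-vrgamedevgirl",
    "ComfyUI-SequentialWildcardPrompt",
    "ComfyUI-WildcardPromptAssembler",
    "ComfyUI-WildcardPromptToolkit",
    "mp_aspect_res_selector",
]


def missing_custom_nodes(custom_nodes_dir, expected=EXPECTED_FOLDERS):
    """Return the expected folder names not found in custom_nodes_dir.

    The comparison ignores case. A missing directory counts as "nothing installed".
    """
    root = Path(custom_nodes_dir)
    if not root.is_dir():
        return list(expected)
    present = {entry.name.lower() for entry in root.iterdir() if entry.is_dir()}
    return [name for name in expected if name.lower() not in present]
```

Calling `missing_custom_nodes("ComfyUI/custom_nodes")` returns a list of the folders that still need installing. An empty list means everything on the list was found.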

 

Also, the Film Grain and Sharpen nodes in this workflow are provided by comfyui-vrgamedevgirl, so if those nodes are missing, that is the pack to check first.

 

Main workflow overview

 

 

The main workflow overview has changed slightly in 4.6. The big visible change is that the old prompt-related sections have been replaced by the new Prompt From Image loader, the new Prompt / Prompt Builder node, and the Gemma4 Prompt Enhancer node.

 

The workflow is still laid out so that the main controls are accessible from the front page:

 

- Image Saver

- ZiB LoRA Loader

- ZiT LoRA Loader

- Prompt From Image

- Prompt / Prompt Builder

- Prompt Enhancer

- Negative Prompt

- Image 2 Image

- Latent Switch

- Global

- LG Noise Injection

- Post Processing

- Optional upscaling sections

 

For normal use, you can start from the main workflow page. You do not need to dive into every subgraph straight away. I would strongly recommend running a simple text-to-image test first, then enabling the extra sections one at a time.

 

1. Global subgraph

 

 

The Global subgraph is still the heart of the workflow. This is where the main Z-Image Base and Z-Image Turbo models are loaded, and where the diffusion settings, aspect ratio, dimensions, seed handling, scheduler, VAE, and latent setup all come together.

 

In simple terms:

 

- ZiB / Base handles the first composition pass.

- ZiT / Turbo handles the refinement pass.

- The mp_aspect_res_selector node controls aspect ratio, orientation, megapixels, and aligned output dimensions.

- The seed generator now feeds the Image Saver properly, so the saved file records the seed that was actually used.

- The workflow is set up to create a strong base image, then push it through a refined second stage.

 

The aspect selector is still one of the most useful quality-of-life parts of the workflow. You can choose an aspect ratio, portrait or landscape orientation, a target megapixel value, and the node works out sensible dimensions for you.

 

By default, I usually work at around 2 megapixels for normal generation. That gives a good balance between quality and speed, and it leaves the optional upscaling paths available if you want to push the image further later.
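
To give a feel for the arithmetic, here is a rough sketch of how dimensions can be worked out from an aspect ratio, orientation, and megapixel target. This is not the code of mp_aspect_res_selector. The function name, the choice of 1 megapixel = 1,000,000 pixels, and the alignment to multiples of 16 are all assumptions made for illustration.

```python
import math


def dims_from_megapixels(aspect_w, aspect_h, megapixels, portrait=False, multiple=16):
    """Return (width, height) near the megapixel target, aligned to `multiple`."""
    long_part, short_part = max(aspect_w, aspect_h), min(aspect_w, aspect_h)
    ratio = long_part / short_part

    # Solve long * short = target_pixels with long = ratio * short.
    target_pixels = megapixels * 1_000_000
    short_px = math.sqrt(target_pixels / ratio)
    long_px = short_px * ratio

    def align(value):
        # Snap to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    long_px, short_px = align(long_px), align(short_px)
    return (short_px, long_px) if portrait else (long_px, short_px)


print(dims_from_megapixels(16, 9, 2.0))                 # (1888, 1056)
print(dims_from_megapixels(16, 9, 2.0, portrait=True))  # (1056, 1888)
```

Because both sides are snapped to the alignment grid, the final pixel count lands close to the target rather than exactly on it. In the example above that is 1888 x 1056, or roughly 1.99 megapixels.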

 

The Global section also includes the VAE switch, tiled VAE decode, sampler, scheduler, split sigmas, sigma resampling, and cache/VRAM cleanup nodes. The default values are the ones that have worked best for me so far, but this is absolutely a place where people who like to tinker can experiment.

 

One small but important 4.6 change: image saving now includes the seed used. In older versions, the saved metadata was useful, but the seed handling was not as complete as I wanted. This version is much better for going back through older generations and understanding exactly how an image was made.

 

2. Prompt and negative prompt

 

The standard positive prompt flow is now handled through the Prompt / Prompt Builder section.

 

For simple use, just type your prompt into the main Prompt / Prompt Builder prompt field and run the workflow. You do not have to enable the wildcard builder nodes. If you want a normal, controlled prompt workflow, keep it simple and use it like a regular prompt box.

 

The negative prompt is still handled separately in the red Negative Prompt node. I have included a standard negative prompt note in the workflow as a starting point, but you should adjust it to suit your own style and subject matter.

 

3. Prompt / Prompt Builder subgraph

 

 

This is one of the biggest changes in 4.6.

 

The Prompt / Prompt Builder subgraph can be used in two ways:

 

- As a simple prompt box

- As a wildcard prompt builder

 

If you only want to write one prompt manually, you can ignore most of the inside of the subgraph. Just make sure the six Wildcard Rule Builder nodes are bypassed, then type the prompt in via the main graph, run the workflow, and carry on as normal.

 

 

If you want the workflow to build prompts for you, enter the subgraph and enable the Wildcard Rule Builder nodes you want to use. The included example is set up around character, outfit, pose, wings, framing, and camera angle style wildcard files, but the same system can be used for your own subjects, styles, locations, lighting, camera terms, character details, or anything else that works well as a line-by-line wildcard list.

 

The prompt builder is made from three main parts:

 

- Wildcard Rule Builder nodes

- Wildcard Config Combiner

- Wildcard Prompt Assembler

 

Wildcard Rule Builder nodes

 

 

Each Wildcard Rule Builder node controls one wildcard source.

 

The important fields are:

 

- token_name

- wildcard_file

- mode

- start_line_number

- repeat_each_line

 

token_name is the name of that wildcard rule. In the included setup, the example token names include prompt, outfit, pose, wings, framing, and camera_angle.

 

Write a base prompt that uses the token names in curly braces. For example:

{girl} is wearing {outfit} in {location}

Here girl, outfit, and location are the token names you have assigned to your wildcard rules.

 

wildcard_file is the text file the node reads from. Each line in the selected file is treated as an available option.

 

mode controls how the wildcard moves across workflow executions. In the included workflow, some nodes are set to next, and others are set to randomize.

 

next is useful when you want to step through a wildcard file in order. For example, you might want to move through a base prompt list one line at a time.

 

randomize is useful when you want variety. For example, poses, framing, camera angles, lighting ideas, locations, or style details can work well when they are randomly selected.

 

start_line_number is the starting line in the wildcard file. It is 1-based, so line 1 means the first line in the file.

 

 

repeat_each_line controls how many executions the workflow stays on one line before changing. If this is set to 1, it can change every run. If it is set higher, the same line will be repeated for several generations before moving on.

 

 

The Rule Builder also has a rule_preview output. This is useful for checking what the rule is doing before you rely on it for a larger batch.
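
The way mode, start_line_number, and repeat_each_line interact can be sketched roughly as below. This is an illustration, not the Rule Builder's actual code. The function name `pick_line`, the `run_counter` and `seed` arguments, and the wrap-around at the end of the file are assumptions.

```python
import random


def pick_line(lines, mode, run_counter, start_line_number=1, repeat_each_line=1, seed=0):
    """Choose one wildcard line for a given run (run_counter starts at 0)."""
    if not lines:
        raise ValueError("wildcard file has no lines")
    if start_line_number < 1 or repeat_each_line < 1:
        raise ValueError("start_line_number and repeat_each_line must be >= 1")

    # The selection only changes once every `repeat_each_line` runs.
    step = run_counter // repeat_each_line

    if mode == "next":
        # start_line_number is 1-based; wrapping past the last line is an assumption.
        index = (start_line_number - 1 + step) % len(lines)
    elif mode == "randomize":
        # Seeding with the step keeps the same pick while a line is being repeated.
        index = random.Random(f"{seed}:{step}").randrange(len(lines))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return lines[index]


outfits = ["red dress", "blue coat", "green cloak"]
for run in range(5):
    print(run, pick_line(outfits, "next", run, repeat_each_line=2))
```

With repeat_each_line set to 2, the loop stays on each line for two runs before moving on. That matches the behaviour described above for values higher than 1.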

 

Wildcard Config Combiner

 

The Wildcard Config Combiner collects the enabled Rule Builder outputs into one combined wildcard configuration.

 

In this workflow, it has six rule inputs:

 

- rule_1

- rule_2

- rule_3

- rule_4

- rule_5

- rule_6

 

You do not need to use all six. If you only want one or two wildcard sources, enable only those. If you want a more complex prompt recipe, enable more of them.

 

Wildcard Prompt Assembler

 

The Wildcard Prompt Assembler is where the final prompt is assembled.

 

It takes the wildcard configuration from the combiner, combines it with the prompt text, resolves the selected wildcard values, and outputs the finished prompt.

 

The important outputs are:

 

- prompt_out

- preview_text

- selected_values

- resolved_token_count

 

prompt_out is the final prompt that leaves the Prompt / Prompt Builder subgraph.

 

preview_text is the one to watch while testing, because it shows the run counter, the original prompt, the resolved values, and the final prompt.

 

selected_values is useful if you want to see exactly which wildcard values were chosen.

 

resolved_token_count is useful for checking whether the assembler actually found and resolved wildcard tokens.

 

The assembler also includes a run_counter. This matters for sequential wildcard behaviour. If the counter is set to increment after generation, the sequential rules can move forward each run. If you reset or change the counter, the sequence position changes as well.
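
As a rough picture of what token resolution involves, the sketch below replaces {token} placeholders with chosen values and counts how many were resolved. It is not the Wildcard Prompt Assembler's actual code. The function name `assemble_prompt` and the choice to leave unknown tokens untouched are assumptions.

```python
import re

# Matches placeholders such as {outfit} or {camera_angle}.
TOKEN_PATTERN = re.compile(r"\{([A-Za-z0-9_]+)\}")


def assemble_prompt(template, selected_values):
    """Return (prompt_out, resolved_token_count) for a template and chosen values."""
    resolved = 0

    def substitute(match):
        nonlocal resolved
        name = match.group(1)
        if name in selected_values:
            resolved += 1
            return selected_values[name]
        # Unknown tokens are left as written so they are easy to spot in a preview.
        return match.group(0)

    prompt_out = TOKEN_PATTERN.sub(substitute, template)
    return prompt_out, resolved


values = {"girl": "a knight", "outfit": "silver armour", "location": "a ruined chapel"}
print(assemble_prompt("{girl} is wearing {outfit} in {location}", values))
# ('a knight is wearing silver armour in a ruined chapel', 3)
```

A resolved count lower than the number of tokens in the template usually points to a token name that does not match any enabled rule.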

 

Example wildcard setup in this workflow

 

 

The included 4.6 example uses these wildcard sources:

 

- prompt from a baseline prompt wildcard file, set to next

- outfit from a lingerie wildcard file, set to next

- pose from a pose wildcard file, set to randomize

- wings from a succubus wings/horns/tail wildcard file, set to randomize

- framing from a framing wildcard file, set to randomize

- camera_angle from a camera angle wildcard file, set to randomize

 

That is only an example. The real value of the system is that you can swap the wildcard files and token names for your own setup.

 

A good way to test it is:

 

1. Enable one Rule Builder node.

2. Run the workflow once.

3. Check the preview text.

4. Enable a second Rule Builder node.

5. Check the preview again.

6. Build up from there.

 

That way, if the prompt does something unexpected, you know which wildcard rule caused it.

 

4. Prompt From Image and Prompt Enhancer

 

 

The old Auto Prompt from Image section has been removed and replaced.

 

In 4.6, Prompt From Image is now a loader that works together with the Prompt Enhancer subgraph. This is important: the image loader on its own is not really the full feature. It is meant to feed the Prompt Enhancer.

 

The Prompt Enhancer uses Gemma4 with a prompt enhancing command I wrote. When enabled, it can enhance either:

 

- The basic prompt coming from the Prompt / Prompt Builder subgraph

- The loaded image from the Prompt From Image node, if an image is loaded and the enhancer is being used with that image input

 

So the flow is basically:

 

1. Write a simple prompt or build one with the Prompt Builder.

2. Or, optionally, load an image into Prompt From Image.

3. Let Gemma4 enhance the prompt or image-guided idea.

4. Send the enhanced prompt onward to the positive prompt conditioning.

 

If you do not load an image, Gemma4 works from the text prompt. If you do load an image, it can use the image as part of the prompt enhancement process. If you are prompting from an image, leave the text prompt blank.

 

The Prompt Enhancer subgraph includes:

 

- A Gemma4 CLIP loader

- The TextGenerate node

- The main enhancement prompt

- Preview output for the enhanced prompt

- Markdown notes with example enhancement commands

- A model storage note for the Gemma4 text encoder

 

The required Gemma4 text encoder is:

 

gemma4_e4b_it_fp8_scaled.safetensors

 

It should be placed here:

 

ComfyUI/models/text_encoders/gemma4_e4b_it_fp8_scaled.safetensors

 

The workflow note also links to:

 

- https://huggingface.co/Comfy-Org/gemma-4/blob/main/text_encoders/gemma4_e4b_it_fp8_scaled.safetensors

- https://huggingface.co/Comfy-Org/gemma-4/tree/main/text_encoders

 

For the CLIP loader inside the Prompt Enhancer subgraph, the settings should be:

 

- type: stable_diffusion

- device: default

 

The enhancement prompt is currently written to produce a structured cinematic Z-Image prompt with labelled sections:

 

- CONCEPT & MEDIUM

- SUBJECT DESCRIPTION

- ACTION & INTERACTION

- APPAREL

- ENVIRONMENT & FOREGROUND

- BACKGROUND & LIGHTING

 

This is designed to take a simple idea and turn it into something more detailed and usable by the image model.

 

I have also included extra example prompt-enhancement commands in the workflow as markdown notes, so you can copy, edit, or replace them if you want to experiment with different enhancement styles.

 

5. LoRAs

 

The workflow still includes two Power LoRA Loaders in the main workflow:

 

- One for Z-Image Base

- One for Z-Image Turbo

 

These sit above the main prompt area. Add your LoRAs there exactly as you normally would, and set the strengths to taste.

 

If you like mixing style LoRAs and character LoRAs between stages, this setup makes that much easier.

 

The workflow is also set up so the LoRAs used in the generation are carried through into the saved image metadata, alongside the checkpoint information and the prompt data.

 

6. Image-to-Image subgraph

 

The Image-to-Image section is still there for img2img.

 

Important:

 

- When using img2img, go to the Latent Switch node underneath the main prompt area.

- Select input 1 for img2img.

- Input 2 is the empty latent path.

- When using img2img, reduce the denoise value in the Global subgraph.

 

As a starting point, I usually recommend trying denoise around 0.8, then adjusting from there.

 

Lower denoise values preserve more of the source image. Higher denoise values allow the model to reinterpret the source image more aggressively.
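
One common way samplers implement denoise is to build a longer noise schedule and run only its tail, so the source image starts partly noised instead of being fully replaced. The sketch below uses a made-up linear schedule purely for illustration. The real values depend on the sampler, scheduler, and split-sigma settings in the Global subgraph, and both function names are hypothetical.

```python
def linear_sigmas(steps):
    """Toy noise schedule: steps + 1 values falling evenly from 1.0 to 0.0."""
    return [1.0 - i / steps for i in range(steps + 1)]


def img2img_sigmas(steps, denoise, schedule=linear_sigmas):
    """Return the part of the schedule that is actually sampled for img2img."""
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in (0, 1]")
    if denoise == 1.0:
        return schedule(steps)
    # Stretch the schedule, then keep only the last `steps` intervals.
    total_steps = max(steps, round(steps / denoise))
    return schedule(total_steps)[-(steps + 1):]


for denoise in (1.0, 0.8, 0.5):
    sigmas = img2img_sigmas(8, denoise)
    print(denoise, "starts at noise level", round(sigmas[0], 3))
```

In this toy schedule the starting noise level tracks the denoise value, which is why 0.8 keeps only a loose impression of the source while lower values hold on to much more of it.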

 

One useful workflow is to load the same reference image into Image 2 Image and Prompt From Image, then let the Prompt Enhancer create a better prompt from the image while img2img carries the structure through the diffusion process.

 

7. My anime-to-realistic method

 

This is still one of the things I enjoy using the workflow for.

 

The new 4.6 method is:

 

1. Load the anime image into the Image 2 Image subgraph.

2. Set the Latent Switch to input 1.

3. Load the same image into Prompt From Image.

4. Enable/use the Prompt Enhancer.

5. Let Gemma4 create an enhanced prompt from the image.

6. Remove or edit any wording that pushes the result back toward anime, illustration, cel shading, or cartoon styling.

7. Add a realism instruction if needed, such as:

 

Create an incredibly lifelike cinematic style realistic image that is indistinguishable from reality:

 

8. Run again and refine with denoise, ControlNet, prompt wording, and enhancement settings.

 

If the structure drifts too much, enable the ControlNet subgraph and start with Depth Anything.

 

8. ControlNet subgraph

 

The ControlNet section is still optional, but it can be very useful when you want to preserve structure.

 

Once the subgraph is enabled, enter it and turn on the groups you want to use.

 

For anime-to-realistic conversions, I would usually start with:

 

- Loader, which needs to be active every time

- Depth Anything, usually my first choice

 

The built-in options include:

 

- Depth Anything V3

- Depth Anything V2

- Canny

- OpenPose

 

A good simple test is to use the same source image in img2img and ControlNet together, then let Depth Anything help hold the structure while Z-Image pushes the image toward the new style.

 

9. Post Processing subgraph

 

The post-processing section is still enabled with settings that have worked well for me:

 

- A little sharpen

- A little film grain

 

They are there to add a bit of bite and texture to the final image, but definitely try turning them off or adjusting them to suit your own taste.

 

Some images benefit from them. Some images look better without them.

 

10. Upscaling options

 

There are still two optional upscaling paths.

 

SeedVR2 Upscale

 

SeedVR2 is the first optional upscale path. It includes its own post-processing as well.

 

I do not use it on every image, because I am usually happy generating at around 2MP, but it is there if you want to push detail further.

 

If you enable it, just be aware that it is doing real work and can take a while.

 

Ultimate SD Upscale

 

Ultimate SD Upscale is the second optional upscale option.

 

Again, it is optional. Again, it is worth experimenting with.

 

If you enable both upscalers, the output from SeedVR2 feeds into Ultimate SD Upscale. This can produce very large final results, but it can also mean leaving your PC running for a long time on one image.
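
To see how quickly the pixel count grows when the two stages are chained, here is a small back-of-the-envelope sketch. It assumes each stage behaves like a simple scale multiplier. The actual nodes may be configured by target resolution instead, and the 2x factors are only example values.

```python
def chained_upscale(width, height, factors):
    """Apply each scale factor in order; return (width, height, megapixels) per stage."""
    stages = []
    for factor in factors:
        width, height = round(width * factor), round(height * factor)
        stages.append((width, height, width * height / 1_000_000))
    return stages


# Example: a roughly 2 MP image passed through two hypothetical 2x stages.
for w, h, mp in chained_upscale(1888, 1056, [2, 2]):
    print(f"{w} x {h} = {mp:.1f} MP")
```

Each 2x stage quadruples the pixel count, so two stages turn a roughly 2 MP image into roughly 32 MP. That goes a long way toward explaining the long run times.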

 

11. Image saving and metadata

 

The Image Saver setup is one of the things I care about most in this workflow.

 

The saved image data is designed to preserve useful generation information rather than just outputting a plain image file.

 

In 4.6, the saved metadata is set up to include:

 

- Checkpoints used

- LoRAs used

- Prompt-related information

- Image dimensions

- Sampler and scheduler information

- The seed used

 

That last one is the important update. The Global subgraph now sends the seed through so the Image Saver can save the seed used by the generation.

 

This makes the workflow more useful if you like going back through older images and working out exactly how they were made. It also makes posting to CivitAI easier, because the generation data is more complete.
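
If you want to pull the seed back out of a saved file without opening ComfyUI, something like the sketch below may work for PNG output. It rests on two assumptions: that the Image Saver wrote an A1111-style "parameters" text entry containing "Seed: <number>", and that the entry is stored as a plain tEXt chunk. Prompts with non-Latin characters may be stored in an iTXt chunk instead, which this sketch does not read. Both function names are made up for the example.

```python
import re
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return {keyword: text} for every uncompressed tEXt chunk in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
    return chunks


def find_seed(parameters_text):
    """Pull the seed out of an A1111-style parameters string, or return None."""
    match = re.search(r"\bSeed:\s*(\d+)", parameters_text)
    return int(match.group(1)) if match else None
```

A typical use would be to read the file with `open(path, "rb").read()`, pass the bytes to `read_png_text_chunks`, and then call `find_seed` on the "parameters" entry if one is present.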

 

Recommended starting workflow

 

If you are loading the workflow for the first time, I would suggest this order:

 

1. Install the required custom node packs.

2. Copy the included custom node folders into ComfyUI/custom_nodes/.

3. Add the Gemma4 text encoder to ComfyUI/models/text_encoders/ if you want to use Prompt Enhancer.

4. Restart ComfyUI.

5. Load the workflow.

6. Load your ZiB and ZiT checkpoints.

7. Add any LoRAs you want.

8. Set your aspect ratio, orientation, and megapixels in the Global subgraph.

9. Write a simple prompt in Prompt / Prompt Builder.

10. Write or adjust your negative prompt.

11. Run a basic text-to-image test first.

 

After that, enable extras one at a time:

 

- Prompt Enhancer

- Prompt From Image

- Wildcard Prompt Builder nodes

- Img2Img

- ControlNets

- Post Processing

- SeedVR2 Upscale

- Ultimate SD Upscale

 

That way, if something breaks, you will have a much better idea of which section caused it.

 

A few final notes

 

There are a lot of moving parts in this workflow. I know that.

 

The goal of 4.6 was to make the prompting side more flexible while still keeping the simple prompt workflow available. You do not need to use the wildcard builder or prompt enhancer every time. They are there when you want more automation, more variation, or a stronger starting prompt from an image.

 

If you find an obvious bug or something confusing in the layout, please let me know and I will try to fix it when I can.

 

More improvements and simplifications will come over time. This is still a living workflow rather than a finished polished product.

 

Most importantly: please enjoy it. I hope it helps you make cool things.

 

If you use it...

 

If you make something with this workflow, please consider submitting your creations through the workflow page using the Add Post button.

 

That helps show other people that the resource is being used, and it lets me see what you have made, which genuinely motivates me to keep improving and sharing more workflows.

 

Thanks for checking it out, and happy generating.


Z-Image 4.0 Workflow Guide

First of all, a quick thank you to the people behind ComfyUI, CivitAI, and all of the creators making nodes, checkpoints, and LoRAs, then sharing them for free. Without that generosity, this would never have become a hobby I could properly pursue, and I definitely would not be making images like this. Because of that, I’ve always tried to follow the same ethos and share my workflows and prompts whenever I can. 🙌

Also, a small disclaimer before we begin: I’m currently off work and on some fairly strong pain medication, so if anything in this post is a little wonky, that’s the reason. I’ve wanted to get this workflow out for a while, and now I finally have the time, so I’ve had ChatGPT help me pull this post together. Say hello, GPT 👋

This workflow is not finished. I’ll keep adding to it, tidying it up, and simplifying things where I can. I’m not a programmer, I currently work two jobs, and I’m applying for a third, so please be patient with me. If something doesn’t work, feel free to ask and I’ll do my best to help.


What this workflow does

At its core, this workflow combines Z-Image Base and Z-Image Turbo. Base handles the initial composition, and Turbo is used to refine the result.

This workflow includes:

  • Text-to-image generation

  • Image-to-image

  • Built-in ControlNets

  • Auto-prompting from images

  • Prompt enhancement

  • Optional post-processing

  • Optional upscaling with SeedVR2 and Ultimate SD Upscale

  • CivitAI-friendly metadata saving

  • Saved image data that includes the checkpoints and LoRAs used in the generation


Important first step: custom node required

To get this workflow running properly, you’ll need to install a custom node that I created. It is included in the .zip download.

The node is called mp_aspect_res_selector and it needs to be placed inside:

ComfyUI\custom_nodes

I am absolutely not a programmer. This node was made because I wanted a simple way to choose an aspect ratio, orientation, and megapixel target, then have the workflow automatically calculate sensible dimensions that play nicely with image generation. It was very much vibe-coded with ChatGPT for a practical purpose, and it’s been genuinely useful for me.

The basic idea is simple: you choose an aspect ratio, choose whether you want portrait or landscape, choose a megapixel target, and the node works out the final pixel dimensions for you. That means less messing about and a much quicker way to try different compositions.


Main workflow overview

The workflow is split into a few main sections, so once you know what each area does, it becomes much easier to use.

1. Global subgraph

The majority of variables can be accessed from the main workflow.

This is the heart of the workflow. This is where you load the Z-Image Base checkpoint and the Z-Image Turbo checkpoint that you want to use.

The settings in here are currently the ones that have worked best for me, but they are absolutely open to experimentation. If you like to tinker, this is one of the main places to do it.

In simple terms:

  • ZiB / Base does the initial composition and structure

  • ZiT / Turbo handles the refinement pass

  • The custom megapixel/aspect selector controls your output dimensions in a cleaner way

  • The workflow is set up to give you a strong base image, then push it further with refinement

By default, I usually work at around 2 megapixels, and for normal generation that is often more than enough.

Under the Global section are the LG Noise Injection and Post Processing sections. These are already set to values that have worked well for me, but please do experiment. Sometimes disabling something completely is just as useful as tweaking it.

2. Prompt and negative prompt

For standard use, just type your positive prompt into the green prompt node and your negative prompt into the red negative prompt node, then run the workflow. That is the quickest and simplest way to get started.


LoRAs

I’ve included two Power LoRA Loaders in the main workflow:

  • One for Z-Image Base

  • One for Z-Image Turbo

These sit above the main prompt area. Add your LoRAs there exactly as you normally would, and set your strengths to taste.

If you are someone who likes mixing style LoRAs and character LoRAs between stages, this makes that much easier.

I’ve also set things up so that the LoRAs used in the workflow are carried through into the saved image metadata, alongside the checkpoint information, which makes it much easier to keep track of how a final image was built.


Wildcard / Prompt Enhancement subgraph

When enabled, this section takes whatever you have written in the standard green positive prompt, then combines it with any extra text and/or wildcards entered into the wildcard section, and sends the result through QwenVL-Mod Prompt Enhancer.

So the flow here is basically:

  1. Write your normal prompt

  2. Add optional wildcard text

  3. Optionally randomise the wildcard content

  4. Feed the combined prompt into the prompt enhancer

Inside this section, you can choose an enhancement method from the dropdown menu. I’ve also included two custom JSON prompt templates that you can paste into the text box under the enhancement style setting if you want to build your own enhancement format.

I’ve found this especially useful when pushing prompts into a more structured JSON-style format, and it has worked particularly well for my NSFW prompting.

If you like your prompts a bit more controlled and machine-readable, this part can be very handy. If you prefer to keep things simple, you can ignore it completely and just use your normal prompt as-is.


Auto Prompt from Image

This section lets you load an image and have QwenVL-Mod generate a descriptive prompt from it.

That makes it useful for:

  • Reverse-engineering an existing image

  • Building a starting prompt from reference art

  • Converting illustrated or anime images into a more descriptive prompt structure

A very useful workflow here is to let the model describe the image, then copy that result and refine it manually before using it as your main generation prompt.


Image-to-Image subgraph

This section allows you to load an image and use it for img2img.

Important:

  • When using img2img, go to the Latent Switch node underneath and manually select input 1

  • For img2img runs, I also recommend reducing the denoise in the Global subgraph to around 0.8 as a starting point

From there, adjust the denoise value to suit the result you want. Lower values will preserve more of the source image. Higher values will allow the model to reinterpret it more aggressively.

One thing this works very nicely for is turning anime images into realistic images. That is one of the main reasons I built these extra tools into the workflow.


My anime-to-realistic method

  1. Load the anime image into the Image 2 Image subgraph

  2. Enable the Auto Prompt subgraph and load the same image there as well

  3. Run one image to get a descriptive prompt

  4. Copy that result into the main green prompt box

  5. Disable the Auto Prompt section

  6. Add this line to the top of the prompt:

Create an incredibly lifelike cinematic style realistic image that is indistinguishable from reality:
  7. Then remove any obvious references to anime, animation, cel shading, and similar wording

  8. Run again and refine with denoise and prompt changes

From there it becomes a bit of a balancing act between denoise, prompt wording, and whether you want to enable prompt enhancement.

And if that still isn’t enough...

Enable the ControlNet subgraph.


ControlNet subgraph

Once the subgraph is enabled, enter it and turn on the groups you want to use.

For anime-to-realistic conversions, I would usually start with:

  • Loader, which needs to be active every time

  • Depth Anything, usually my first choice

The built-in options include:

  • Depth Anything V3

  • Depth Anything V2

  • Canny

  • OpenPose

A good simple test is to use the same source image in img2img and ControlNet together, then let Depth Anything help hold the structure while Z-Image pushes it into realism.


Post Processing subgraph

This section is enabled with settings that I’ve found work well for me:

  • A little sharpen

  • A little film grain

They are there to add a bit of bite and texture to the final image, but definitely try turning them off or adjusting them to suit your own taste. Some images benefit from them more than others.


Upscaling options

SeedVR2 Upscale

This is the first optional upscale path. It includes its own post-processing as well.

I do not use it on every image, because I’m usually happy generating at 2MP, but it is there if you want to push detail further.

If you enable it, just be aware that it is doing real work and can take a while.

Ultimate SD Upscale

This is the second upscale option. Again, optional. Again, worth experimenting with.

And yes, if you enable both, the output from SeedVR2 will feed into Ultimate SD Upscale. This can absolutely give you huge results... but it can also mean leaving your PC running overnight for a single image 😅


Image saving and metadata

One thing I nearly forgot to mention, and it is actually quite an important part of the workflow, is the way the Image Saver has been set up.

I have configured it so that the saved image carries useful generation data with it, rather than just dumping out a plain image file. That means the workflow is designed to preserve important metadata for later reference, posting, and organisation.

Most importantly, I’ve made it so that the saved data includes:

  • The checkpoints used

  • The LoRAs used

  • Prompt-related information

  • Other generation settings such as dimensions and sampler data

That makes the workflow much more useful if you like to go back through older generations and remember exactly how an image was made. It also makes life easier when posting to places like CivitAI, because the generation data is much more complete and useful.

So if you are someone who likes keeping track of what model mix, LoRAs, and settings created a particular image, this should help a lot. 🙂

A small but useful quality-of-life feature in this workflow is that I’ve gone out of my way to make the saved metadata more complete, including the checkpoints and LoRAs used. That might sound like a small thing, but it makes revisiting old generations, troubleshooting results, and posting much easier.


Recommended starting workflow

  1. Install the custom node

  2. Load your ZiB and ZiT checkpoints

  3. Add any LoRAs you want

  4. Set your aspect ratio, orientation, and megapixels

  5. Write your main prompt and negative prompt

  6. Run a basic text-to-image test first

  7. Then start enabling extras one at a time:

    • Wildcard / Prompt Enhancement

    • Auto Prompt

    • Img2Img

    • ControlNets

    • Upscaling

That way, if something breaks, you’ll have a much better idea of which section caused it.


A few final notes

There are a lot of moving parts in this workflow. I’m very aware of that.

I may not always be technical enough to solve every issue, but if you do find an obvious bug or error, please let me know and I will try to fix it when I can.

More improvements and simplifications will come over time. This is very much a living workflow rather than a finished polished product.

Most importantly: please enjoy it. I hope it helps you make cool things.


If you use it...

If you make something with this workflow, please consider submitting your creations through the workflow page using the Add Post button.

That helps show other people that this resource is being used, and it lets me see what you’ve made, which genuinely motivates me to keep improving and sharing more workflows.

Thanks for checking it out, and happy generating. ✨