
Z-Image Turbo - Quantized for low VRAM


Updated: Dec 1, 2025

Type: Checkpoint Trained
Format: SafeTensor
Base Model: ZImageTurbo
Published: Nov 28, 2025
Hash (AutoV2): 59610861D4
License: Apache 2.0

Z-Image Turbo is a distilled version of Z-Image, a 6B-parameter image model based on the Lumina architecture, developed by the Tongyi Lab team at Alibaba Group. Source: https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

I've uploaded quantized versions ranging from bf16 down to fp8; at fp8 the weights have their precision - and consequently their size - halved for a substantial performance boost while keeping most of the quality. Inference time should be similar to regular "undistilled" SDXL, with better prompt adherence and resolution/details. Ideal for weaker PCs.
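For the curious, here is a minimal sketch of what a bf16 -> fp8 cast looks like with PyTorch and safetensors (requires a recent PyTorch with float8 support). The filenames are placeholders, and real quantization pipelines often keep sensitive layers (norms, embeddings) in higher precision, so treat this as an illustration rather than the exact recipe used for these uploads:

```python
import torch
from safetensors.torch import load_file, save_file

# Load the bf16 checkpoint (placeholder filename).
state = load_file("z_image_turbo_bf16.safetensors")

quantized = {}
for name, tensor in state.items():
    if tensor.dtype == torch.bfloat16 and tensor.ndim >= 2:
        # Cast weight matrices to fp8 (e4m3), halving their size.
        quantized[name] = tensor.to(torch.float8_e4m3fn)
    else:
        # Keep biases, norms, and other small tensors as-is.
        quantized[name] = tensor

save_file(quantized, "z_image_turbo_fp8.safetensors")
```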

Features

  • Lightweight: the Turbo version was trained for low step counts (5-15), and the fp8 quantization weighs roughly 6 GB, making it accessible even to low-end GPUs (see the quick size check after this list).

  • Uncensored: many concepts censored by other models (<cough> Flux <cough>) are doable out of the box.

  • Good prompt adherence: comparable to Flux.1 Dev's, thanks to its powerful text encoder, Qwen3 4B.

  • Text rendering: comparable to Flux.1 Dev's; some say it's even better despite the much smaller model size (though probably not as good as Qwen Image's).

  • Style flexibility: photorealistic images are its biggest strength, but it can also do anime, oil painting, pixel art, low poly, comic book, watercolor, vector art / flat design, sketch, pop art, infographics, etc.

  • High resolution: capable of generating up to 4MP natively (i.e. before upscaling) while maintaining coherence.
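The size figures above are easy to sanity-check: 6B parameters at 1 byte each (fp8) come to about 6 GB, versus 2 bytes each (about 12 GB) for bf16:

```python
# Back-of-envelope size check for a 6B-parameter model.
params = 6e9
print(f"bf16: ~{params * 2 / 1e9:.0f} GB")  # ~12 GB
print(f"fp8:  ~{params * 1 / 1e9:.0f} GB")  # ~6 GB
```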

Dependencies

Instructions

Workflow and metadata are available in the showcase images.

  • Steps: 5 - 15 (6 - 9 is the sweet spot)

  • CFG: 1.0. At CFG 1.0 the negative prompt is ignored, so there is no need for one.

  • Sampler/scheduler: depends on the art style. Here are my findings so far:

    • Photorealistic:

      • Favourite combination for the base image: euler + beta, simple, or bong_tangent (from RES4LYF) - fast and good even at low (5) steps.

      • Most multistep samplers (e.g. res_2s, res_2m, dpmpp_2m_sde) are great, but some will be about 40% slower at the same step count. They might work better with a scheduler like sgm_uniform.

      • Almost any sampler will work fine: sa_solver, seeds_2, er_sde, gradient_estimation.

      • Some samplers and schedulers add too much texture; you can tone it down by increasing the shift (e.g. setting shift to 7 in ComfyUI's ModelSamplingAuraFlow node - see the shift sketch after these instructions).

      • Some require more steps (e.g. karras).

    • Illustrations (e.g.: anime):

      • res_2m or rk_beta produce sharper and more colourful results.

    • Other styles:

      • I'm still experimenting. Use euler (or res_2m) + simple just to be safe for now.

  • Resolution: up to 4MP natively. When in doubt, use the same resolutions as SDXL, Flux.1, Qwen Image, etc. (it works even as low as 512px, like in the SD 1.5 days). Some examples:

    • 896 x 1152

    • 1024 x 1024

    • 1216 x 832

    • 1440 x 1440

    • 1024 x 1536

    • 2048 x 2048

  • Upscaling and/or detailers are recommended to fix smaller details like eyes, teeth, and hair. See my workflow embedded in the main cover image.

    • If going over 2048px on either side, I recommend the tiled upscale method, i.e. Ultimate SD Upscale at low denoise (<= 0.3).

    • Otherwise, I recommend that your second-pass KSampler either use a low denoise (< 0.5) or start sampling at a later step (e.g. start at step 5 out of 9).

    • Either way, I recommend setting the shift to 7 to avoid noisy textures in your results. Keep in mind that some schedulers (e.g. bong_tangent) may override the shift with their own.

    • At this stage, you may even use samplers that didn't work well in the initial generation. For most cases, I like the res_2m + simple combination.

  • Prompting: the official guidance is that long, detailed prompts in natural language work best, but I've tested comma-separated keywords/tags, JSON, and more; any of these should work fine. Keep prompts in English or Mandarin for more accurate results.
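A note on the shift parameter recommended above: in flow-matching models, the shift remaps sampling timesteps so that more of the schedule is spent at high noise levels, which is what tames over-textured results. Below is a minimal sketch of the remapping formula as commonly used in ComfyUI-style flow sampling nodes (the node's exact internals may differ):

```python
# Flow-matching time shift: remaps a timestep t in [0, 1].
# Higher shift pushes mid-schedule timesteps toward the high-noise end.
def time_shift(t: float, shift: float = 7.0) -> float:
    return shift * t / (1 + (shift - 1) * t)

for t in (0.25, 0.5, 0.75):
    print(f"t={t} -> {time_shift(t):.3f}")
# t=0.25 -> 0.700, t=0.5 -> 0.875, t=0.75 -> 0.955
```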

FAQ

  • Is the model uncensored?

    • Yes, though it might simply not be well trained on the specific concept you're after. Try it yourself.

  • Why do I get too much texture or artifacts after upscaling?

    • See instructions about upscaling above.

  • Does it run on my PC?

    • If you can run SDXL, chances are you can run Z-Image Turbo fp8. If not, it might be a good time to purchase more RAM or VRAM.

    • All my images were generated on a laptop with 32GB RAM and an RTX 3080 Mobile with 8GB VRAM.

  • How can I get more variation across seeds?

    • Generate the initial noise yourself (e.g. start with img2img, or run the first few steps with an empty prompt - see the sketch at the end of this page); or

    • Give clear instructions in the prompt, something like "give me a random variation of the following image: <your prompt>".

  • I'm getting an error on ComfyUI, how to fix it?

    • Make sure your ComfyUI is updated to the latest version. Otherwise, feel free to post a comment with the error message so the community can help.

  • Is the license permissive?

    • It's Apache 2.0, so quite permissive.
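One last sketch, for the variation FAQ above: a hypothetical way to "generate the initial noise yourself" is to feed the sampler a seeded random latent instead of the usual empty (all-zero) one, then use a high denoise (around 0.9) so each seed inherits different low-frequency structure. The latent shape and channel count are placeholders; adjust them to your model and resolution:

```python
import torch

def varied_latent(seed: int, shape=(1, 16, 128, 128), strength: float = 0.5):
    # Seeded random offset instead of the usual all-zero empty latent.
    gen = torch.Generator().manual_seed(seed)
    offset = torch.randn(shape, generator=gen)
    # Feed this to your sampler with denoise ~0.9 for extra variation.
    return strength * offset
```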