Updated: Feb 1, 2026
fp8 quantized Z-Image for ComfyUI, using ComfyUI's quantization feature "TensorCoreFP8Layout".
Scaled fp8 weights: higher precision than plain fp8.
Uses hardware fp8 on supported GPUs (Turbo only, see below).
Also uses "mixed precision": important layers remain in bf16.
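Roughly, "scaled fp8" means each weight tensor is stored as fp8 plus a per-tensor scale, and "mixed precision" means some layers skip quantization entirely. A minimal PyTorch sketch of the idea, not ComfyUI's actual implementation; the keep-list names and the scale key are illustrative assumptions:

```python
import torch

F8_MAX = 448.0  # largest finite value of torch.float8_e4m3fn

def quantize_scaled_fp8(w: torch.Tensor):
    """Return (fp8 weight, per-tensor scale) such that w ~= w_fp8 * scale."""
    scale = w.abs().max().clamp(min=1e-12) / F8_MAX
    w_fp8 = (w / scale).to(torch.float8_e4m3fn)
    return w_fp8, scale

# Hypothetical keep-list; the exact layers kept in bf16 differ per model.
KEEP_BF16 = ("embed", "final_layer", "norm")

def quantize_state_dict(sd: dict) -> dict:
    out = {}
    for name, w in sd.items():
        if w.ndim < 2 or any(k in name for k in KEEP_BF16):
            # "mixed precision": important / small tensors stay in bf16
            out[name] = w.to(torch.bfloat16)
        else:
            w_fp8, scale = quantize_scaled_fp8(w.float())
            out[name] = w_fp8
            out[name + ".scale_weight"] = scale  # scale key name is an assumption
    return out
```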
There is no "official" fp8 version of Z-Image from ComfyUI, so I made my own.
All credit belongs to the original model author. The license is the same as the original model's.
Note: these features are officially supported by ComfyUI; this file is just a weight file.
Use ComfyUI's built-in loader nodes to load it.
If you get errors, report them to the ComfyUI repo, not here.
Base
Quantized Z-Image, i.e. the "base" version of Z-Image.
https://huggingface.co/Tongyi-MAI/Z-Image
Note: no hardware fp8; all calculations still run in bf16. This is intentional: hardware fp8/fp4 does not work well with LoRA.
Turbo
Quantized Z-Image-Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
It supports hardware fp8. For details on hardware fp8 and its hardware requirements, see ComfyUI's TensorCoreFP8Layout.
Qwen3 4b
Update: not recommended.
ComfyUI-GGUF now supports Qwen3, so use GGUF instead. Recommended:
https://huggingface.co/unsloth/Qwen3-4B-GGUF/blob/main/Qwen3-4B-UD-Q8_K_XL.gguf
Why GGUF? GGUF Q8 has slightly higher precision than ComfyUI's built-in scaled fp8.
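A rough way to see the difference: round-trip a random tensor through fp8 e4m3 (3-bit mantissa) and through int8-with-scale (what Q8_0 does per 32-value block). This is an illustration under simplified assumptions (one block, random weights), not a benchmark:

```python
import torch

torch.manual_seed(0)
w = torch.randn(4096, dtype=torch.float32)

# scaled fp8 round trip (per-tensor scale, e4m3)
s_fp8 = w.abs().max() / 448.0
w_fp8 = (w / s_fp8).to(torch.float8_e4m3fn).to(torch.float32) * s_fp8

# scaled int8 round trip (Q8_0-style, single block for simplicity)
s_i8 = w.abs().max() / 127.0
w_i8 = torch.round(w / s_i8).clamp(-127, 127) * s_i8

print("fp8  mean abs error:", (w - w_fp8).abs().mean().item())
print("int8 mean abs error:", (w - w_i8).abs().mean().item())
```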
===
Quantized Qwen3 4b.
https://huggingface.co/Qwen/Qwen3-4B
Scaled fp8 + mixed precision.
Early layers (embed_tokens, layers.[0-1]) and final layers (layers.[34-35]) are kept in bf16.
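To check which tensors were kept in bf16, you can inspect the dtypes stored in the safetensors file; a small sketch using the safetensors library (the filename is a placeholder for whichever file you downloaded):

```python
from safetensors import safe_open

with safe_open("qwen3_4b_scaled_fp8_mixed.safetensors", framework="pt") as f:
    for name in f.keys():
        print(name, f.get_tensor(name).dtype)
```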

