HiDream-O1-Image (codename Peanut) is an 8B text-to-image foundation model from HiDream.ai, built on a Pixel-level Unified Transformer (UiT) that operates end to end on raw pixels with no external VAE or separate text encoder. The same checkpoint handles text-to-image, instruction-based editing, and multi-reference subject personalization natively at up to 2,048 x 2,048.
Originally released by HiDream.ai on Hugging Face. All credit for the model goes to the HiDream.ai team. Civitai is hosting a mirror so creators can run it on-site - head to the original repo for weights, updates, the technical report, and to follow the project directly.
Built by
- HiDream.ai - upstream organization and authors of the technical report.
Versions mirrored on Civitai
Two checkpoints are mirrored, both as fp8 SafeTensors:
- Standard - full 50-step model. Best quality. Guidance scale 5.0.
- Dev - distilled 28-step model. Faster. Guidance scale 0.
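The difference between the two variants at inference time comes down to step count and guidance scale. A minimal sketch of a settings lookup (a hypothetical helper, not part of any official pipeline) captures the numbers above:

```python
# Hypothetical helper mapping the two mirrored variants to their
# documented sampler settings: 50 steps / CFG 5.0 for Standard,
# 28 steps / CFG 0 for the distilled Dev checkpoint.
VARIANT_SETTINGS = {
    "standard": {"num_inference_steps": 50, "guidance_scale": 5.0},
    "dev": {"num_inference_steps": 28, "guidance_scale": 0.0},
}


def sampler_settings(variant: str) -> dict:
    """Return the recommended step count and guidance scale for a variant."""
    try:
        return VARIANT_SETTINGS[variant]
    except KeyError:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(VARIANT_SETTINGS)}"
        )
```

Note that the distilled Dev checkpoint runs guidance-free (scale 0), so it does roughly half the denoising work of Standard per image.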
HiDream also publishes a 200B+ Pro variant upstream, but weights are not public, so it is not mirrored here.
One model, three tasks
The same checkpoint handles text-to-image, instruction-based editing with a single reference image, and multi-reference subject-driven personalization with up to ten reference images. Mode is selected by what you pass at inference - no separate adapters or LoRAs needed.
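That input-driven mode selection can be sketched as a small dispatcher. The function and argument names below are illustrative, not the model's actual API; they only mirror the rule stated above (no references, one reference, or two to ten references):

```python
def select_task(prompt: str, reference_images=None) -> str:
    """Pick the task from the inputs alone, mirroring how the checkpoint
    switches modes: no references -> text-to-image, one reference ->
    instruction-based editing, 2-10 references -> subject personalization."""
    refs = list(reference_images or [])
    if not prompt:
        raise ValueError("a prompt or instruction is always required")
    if len(refs) == 0:
        return "text-to-image"
    if len(refs) == 1:
        return "instruction-editing"
    if len(refs) <= 10:
        return "subject-personalization"
    raise ValueError("at most 10 reference images are supported")
```

The point is that a single checkpoint covers all three branches; the caller never loads a different adapter, only passes different inputs.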
Native 2K and multilingual text
Direct synthesis up to 2,048 x 2,048 without upscaling. Strong long-text rendering in both English and Chinese (LongText-Bench 0.979 EN / 0.978 ZH), 0.90 on GenEval for compositional prompts, and 89.83 on DPG-Bench for dense prompt alignment.
Reasoning-driven prompt agent (upstream only)
The HiDream repo ships a separate "thinking" prompt agent (Gemma-4-31B or an OpenAI-compatible API) that rewrites raw instructions into self-contained prompts before generation. That agent is not part of the Civitai mirror - if you want it, run upstream locally.
Links
- Hugging Face: HiDream-ai/HiDream-O1-Image
- GitHub: HiDream-ai/HiDream-O1-Image
- Technical report: PDF
- License: MIT
