━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✨ **Krea2 Turbo — QwenVL Dual-Mode Workflow**
ComfyUI · Krea-2 Community License · Fast DiT Turbo
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Fast, professional image generation using the Krea-2 Turbo FP8 diffusion model in ComfyUI, with two fully automatic modes. In **Mode A**, you type a subject keyword and QwenVL expands it into a rich, photorealistic prompt. In **Mode B**, you load any reference image; QwenVL analyzes it, and the workflow generates a new image inspired by its style and composition. Switch between modes by changing a single number, with no node rewiring needed. Landscape (1920×1088, approx. 16:9) and portrait (1088×1920, approx. 9:16) presets are built in. Tested on an RTX 5080 16 GB.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✨ **Features**
✅ **Mode A: Keyword → Auto-Expand** — Type subject (e.g., "mountain at golden hour") → QwenVL PromptEnhancer expands to rich visual prompt → generate
✅ **Mode B: Reference Image → Style Capture** — Drop any reference image → QwenVL describes its style & composition → generates new inspired image
✅ **Dual Orientation** — Single index switch: 1920×1088 landscape (16:9) or 1088×1920 portrait (9:16); no node rewiring
✅ **Krea-2 Original Architecture** — NOT FLUX-derived; DiT 12.9B model from Krea; FP8 quant by AlperKTS (~7 GB)
✅ **Stable Sampler Config** — Euler sampler, simple scheduler, CFG 1.0, 10 steps — reliable, no tuning needed
✅ **~8 GB VRAM Active** — Plenty of headroom on 16 GB; safe for concurrent desktop use
✅ **Fast on RTX 5080** — 30–60 sec per 1920×1088 image; 40–70 sec in Mode B (includes VLM analysis)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📦 **Required Models** (3 manual downloads, ~10–11 GB, plus 1 auto-download)
• krea2_turbo_fp8.safetensors (~7 GB) — Main Krea-2 DiT diffusion model (FP8 quantized)
• qwen3vl_4b_fp8_scaled.safetensors (~2–3 GB) — Combined text + vision encoder (Qwen3-VL-4B, FP8)
• qwen_image_vae.safetensors (~1 GB) — Image VAE codec (encode/decode latents)
• Qwen3-VL-2B-Instruct (~2.5 GB, auto-downloads) — Vision model for prompt enhancement and image analysis
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⬇️ **Download Links** (all from one source)
📁 **ComfyUI/models/unet/**
• krea2_turbo_fp8.safetensors — https://huggingface.co/AlperKTS/Krea2_FP8
📁 **ComfyUI/models/clip/** (or text_encoders/)
• qwen3vl_4b_fp8_scaled.safetensors — https://huggingface.co/AlperKTS/Krea2_FP8
📁 **ComfyUI/models/vae/**
• qwen_image_vae.safetensors — https://huggingface.co/AlperKTS/Krea2_FP8
⚠️ *All 3 model files are in the same AlperKTS/Krea2_FP8 repo. Qwen3-VL-2B-Instruct (~2.5 GB) auto-downloads on first use via the ComfyUI-QwenVL node and is cached under ~/.cache/ for later runs.*
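The three manual downloads can also be scripted. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the files sit at the top level of the AlperKTS/Krea2_FP8 repo under exactly the names listed above (not verified here). `COMFY_ROOT` is a hypothetical install path.

```python
from pathlib import Path

REPO_ID = "AlperKTS/Krea2_FP8"  # repo named in the download links above

# filename -> ComfyUI models subfolder, as listed above
MODEL_FILES = {
    "krea2_turbo_fp8.safetensors": "unet",
    "qwen3vl_4b_fp8_scaled.safetensors": "clip",
    "qwen_image_vae.safetensors": "vae",
}


def plan_downloads(comfy_root):
    """Return (filename, target_dir) pairs without touching the network."""
    models_dir = Path(comfy_root) / "models"
    return [(name, models_dir / sub) for name, sub in MODEL_FILES.items()]


def download_all(comfy_root):
    """Fetch each file into its folder (assumes files are at the repo root)."""
    from huggingface_hub import hf_hub_download  # imported lazily

    for name, target_dir in plan_downloads(comfy_root):
        target_dir.mkdir(parents=True, exist_ok=True)
        hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=target_dir)


if __name__ == "__main__":
    # Hypothetical path; point it at your own ComfyUI install.
    COMFY_ROOT = Path.home() / "ComfyUI"
    for name, target in plan_downloads(COMFY_ROOT):
        print(f"{name} -> {target}")
    # download_all(COMFY_ROOT)  # uncomment to actually download (~10-11 GB)
```

Running the script as-is only prints the planned destinations, so you can check the folder layout before committing to the download.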
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🧩 **Required Custom Nodes** (2 packs)
1. **ComfyUI-QwenVL** (AILab / 1038lab) — PromptEnhancer node (Mode A keyword expansion) + VL image analysis (Mode B image-to-prompt)
2. **ComfyUI-Easy-Use** (vjumpkung) — anythingIndexSwitch for mode toggle and orientation picker
Install both via ComfyUI Manager:
• ComfyUI Manager → Search "ComfyUI-QwenVL" → Install
• ComfyUI Manager → Search "ComfyUI-Easy-Use" → Install
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🚀 **How to Use**
**Quick Start:**
1. Download all 3 model files → place in ComfyUI/models/ (unet / clip / vae)
2. Install 2 custom node packs via ComfyUI Manager
3. Load the workflow JSON into ComfyUI
4. Choose mode (top section):
- **🔀 Mode Switch = 0:** Keyword input → QwenVL auto-expands into full prompt
- **🔀 Mode Switch = 1:** Reference image → QwenVL analyzes → generates inspired output
5. Set orientation: **📐 Latent Switch = 0** (landscape) or **1** (portrait)
6. Queue → generate
**Mode A — Keyword → Prompt:**
- Edit "✍️ QwenVL Prompt Enhancer" node → type subject keyword
- QwenVL automatically expands it into a 30–50 word visual prompt
- Best for: creative brainstorming, mood-based generation
**Mode B — Reference Image → Output:**
- Set Mode Switch = 1 → load reference image in "📸 Load Reference Image" node
- QwenVL reads style, composition, lighting → generates inspired image
- Best for: style transfer, mood matching, "I want something like this but different"
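The two index switches described above amount to a pair of lookups. Here is a rough sketch of how they could be set on a workflow exported in ComfyUI's API format (a dict of node id to `class_type` and `inputs`). The node ids `"12"` and `"34"` are hypothetical, and the input name `index` is an assumption about the switch node, so check both against your own export.

```python
import copy

MODES = {0: "keyword", 1: "reference_image"}
ORIENTATIONS = {0: (1920, 1088), 1: (1088, 1920)}  # (width, height)


def describe(mode_switch, latent_switch):
    """Translate the two switch values into a readable summary."""
    width, height = ORIENTATIONS[latent_switch]
    return f"{MODES[mode_switch]} mode at {width}x{height}"


def apply_switches(workflow, mode_node_id, latent_node_id,
                   mode_switch, latent_switch):
    """Return a copy of an API-format workflow with both switches set.

    Assumes each switch node exposes its selection as inputs["index"].
    """
    if mode_switch not in MODES:
        raise ValueError("Mode Switch must be 0 or 1")
    if latent_switch not in ORIENTATIONS:
        raise ValueError("Latent Switch must be 0 or 1")
    patched = copy.deepcopy(workflow)  # leave the caller's dict untouched
    patched[mode_node_id]["inputs"]["index"] = mode_switch
    patched[latent_node_id]["inputs"]["index"] = latent_switch
    return patched


if __name__ == "__main__":
    # Toy stand-in for an exported workflow; node ids are hypothetical.
    toy = {
        "12": {"class_type": "easy anythingIndexSwitch", "inputs": {"index": 0}},
        "34": {"class_type": "easy anythingIndexSwitch", "inputs": {"index": 0}},
    }
    portrait_ref = apply_switches(toy, "12", "34", 1, 1)
    print(describe(1, 1))
    print(portrait_ref["12"]["inputs"], portrait_ref["34"]["inputs"])
```

In the ComfyUI interface itself none of this is needed; changing the two numbers on the switch nodes does the same thing.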
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚙️ **Settings & Parameters**
• **Sampler** — euler (standard, stable)
• **Scheduler** — simple
• **Steps** — 10 (default; raise to 15–20 for refinement, lower to 6–8 for speed)
• **CFG Scale** — 1.0 (recommended for Krea2 Turbo; do not raise above 2.0)
• **Mode Switch** — 0 (keyword) / 1 (reference image)
• **Latent Switch** — 0 (landscape 1920×1088) / 1 (portrait 1088×1920)
• **Seed** — randomize (auto) or fix for reproducibility
• **Negative Prompt** — curated quality filter built-in (edit in ❌ NEGATIVE node)
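The sampler values above can be kept in one place and sanity-checked before queuing. This is only a sketch: the bounds are the ranges quoted in this section rather than hard limits of the model, and the output keys assume the workflow uses ComfyUI's stock KSampler node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerSettings:
    sampler_name: str = "euler"
    scheduler: str = "simple"
    steps: int = 10
    cfg: float = 1.0
    seed: int = 0

    def __post_init__(self):
        # Ranges taken from the settings list above, not from the model itself.
        if not 6 <= self.steps <= 20:
            raise ValueError("steps outside the 6-20 range suggested above")
        if self.cfg > 2.0:
            raise ValueError("CFG above 2.0 is not recommended for Krea2 Turbo")

    def as_ksampler_inputs(self):
        """Return the values keyed by the stock KSampler input names."""
        return {
            "sampler_name": self.sampler_name,
            "scheduler": self.scheduler,
            "steps": self.steps,
            "cfg": self.cfg,
            "seed": self.seed,
        }


if __name__ == "__main__":
    print(SamplerSettings().as_ksampler_inputs())
    print(SamplerSettings(steps=6, seed=1234).as_ksampler_inputs())
```

A fixed `seed` together with unchanged settings corresponds to the "fix for reproducibility" option above.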
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
💡 **Performance Tips**
• **Keyword Mode** — Start vague ("lighthouse") and let QwenVL expand it; edit the generated prompt text if you want to steer it further
• **Reference Mode** — Sharp, well-composed reference images give better style transfer; low-contrast or abstract refs may confuse the VLM
• **Cold Start** — First queue takes ~1–2 min while Qwen3-VL-2B auto-downloads (~2.5 GB); all queues after that are fast
• **Speed Tuning** — 6 steps ≈ 25s; 10 steps ≈ 45s; 15 steps ≈ 70s. Quality gains plateau around 15 steps
• **Batch Generation** — Queue multiple times; model stays loaded, seed auto-randomizes each run
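Batch generation can also be driven from a script through ComfyUI's HTTP endpoint (`POST /prompt`) instead of clicking Queue repeatedly. The sketch below makes several assumptions: the workflow has been exported in API format, the sampler node id `"3"` is hypothetical, and the server runs at the default local address. Seeds are set explicitly here because this sketch does not rely on the interface's auto-randomize control.

```python
import copy
import json
import random
import urllib.request


def build_payloads(workflow, sampler_node_id, count, rng=None):
    """Return `count` request bodies, each with its own random seed."""
    if rng is None:
        rng = random.Random()
    payloads = []
    for _ in range(count):
        wf = copy.deepcopy(workflow)  # keep the original workflow unchanged
        wf[sampler_node_id]["inputs"]["seed"] = rng.randrange(2**32)
        payloads.append({"prompt": wf})
    return payloads


def queue_all(payloads, server="http://127.0.0.1:8188"):
    """POST each payload to a running ComfyUI instance and print the reply."""
    for payload in payloads:
        request = urllib.request.Request(
            f"{server}/prompt",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            print(json.load(response))


if __name__ == "__main__":
    # Toy stand-in; load your exported API-format JSON instead.
    toy = {"3": {"class_type": "KSampler", "inputs": {"seed": 0, "steps": 10}}}
    batch = build_payloads(toy, "3", 4)
    print([p["prompt"]["3"]["inputs"]["seed"] for p in batch])
    # queue_all(batch)  # uncomment with ComfyUI running locally
```

At the quoted ~45 s per 10-step image, a batch of 8 would take roughly 6 minutes on the tested hardware.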
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔗 **Also check out**
For a **still-image workflow** built on a different architecture with Apache-2.0 licensing, see my **Z-Image-Turbo — QwenVL Dual-Mode** workflow (same dual-mode concept, different model) on my profile page.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📝 **Notes & AI Disclosure**
• **AI-Generated Content** — All example outputs are AI-generated by Krea-2 Turbo. Suitable for commercial and creative use (see licensing below)
• **Hardware Tested** — RTX 5080 16 GB VRAM, CUDA 12.8+ (the minimum CUDA release with RTX 50-series support)
• **VRAM Usage** — ~8 GB active during sampling; safe on 16 GB with headroom
• **Output Ownership** — You own all outputs. Commercial use OK if revenue < $1M/yr (Krea-2 Community License)
• **Architecture Note** — Krea-2 is an original DiT 12.9B model (NOT FLUX-derived). Different licensing from Flux.1-dev workflows
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⭐ **Found this useful?**
• Like if it saved you time
• Comment your results — I read every one
• Follow for new ComfyUI workflows, all tested on 16 GB VRAM
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚖️ **Model Attribution & Licensing**
**Krea-2 Turbo** (AlperKTS FP8 Quant)
• License: Krea-2 Community License — https://www.krea.ai/krea-2-licensing
• Architecture: Original DiT 12.9B (NOT FLUX-derived — Krea's own foundation model)
• Commercial use: ✅ OK if total annual revenue < $1,000,000 USD; Enterprise license required above this threshold
• FP8 quantization by AlperKTS is a permitted Derivative under Krea-2 Community License
• Attribution required: credit Krea-2 (krea.ai) and AlperKTS when distributing
• License verified: 2026-06-29
**Qwen3-VL-4B Text + Vision Encoder** (Alibaba Qwen)
• File: qwen3vl_4b_fp8_scaled.safetensors
• License: Apache 2.0 ✅ — Commercial use permitted
**Qwen Image VAE** (Alibaba Qwen)
• File: qwen_image_vae.safetensors
• License: Apache 2.0 ✅ — Commercial use permitted
**ComfyUI Custom Nodes**
• ComfyUI-QwenVL (1038lab): Apache 2.0 / BSD
• ComfyUI-Easy-Use (vjumpkung): Per upstream license
**Workflow JSON**
• License: CC0 Public Domain — free to use, modify, and redistribute without attribution (credit appreciated)
All example outputs are AI-generated. Model weights are third-party and covered by their respective licenses. Download the weights separately from the Hugging Face links above.