Updated: Aug 8, 2025
This workflow extends the prompt adherence of Wan 2.2 by using Qwen for the first stage at a low resolution. It brings remarkably good detail to low-resolution outputs.
Hardware: RTX 3090 24GB
Models: Qwen Q4 GGUF + Wan 2.2 Low GGUF
Elapsed Time E2E (2k Upscale): 300s cold start, 80-130s (0.5MP - 1MP)
Main Takeaway: Qwen latents are compatible with the Wan 2.2 sampler
There are two stages:
Stage 1 (42s-77s): Qwen sampling at 0.75/1.0/1.5MP
Stage 2 (~110s): Wan 2.2, 4 steps
The first stage can go to VERY low resolutions. I haven't tested 512x512 YET, but 0.75MP works.
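If you want to see what the first-stage megapixel targets mean in actual pixel dimensions, here is a rough Python sketch. It is not part of the workflow. The helper name `dims_for_megapixels`, the snap-to-16 rounding, and treating 1MP as 1024x1024 pixels are all my assumptions for illustration, so check them against whatever resize node you actually use.

```python
import math

def dims_for_megapixels(megapixels, aspect_w=1, aspect_h=1, multiple=16):
    """Hypothetical helper: approximate (width, height) for a megapixel target.

    Assumptions (not from the workflow itself):
      - 1 MP is treated as 1024 * 1024 pixels
      - both sides are snapped to a multiple of 16
    """
    target_pixels = megapixels * 1024 * 1024
    aspect = aspect_w / aspect_h

    # Solve width * height = target_pixels with width / height = aspect.
    height = math.sqrt(target_pixels / aspect)
    width = height * aspect

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)

# First-stage sizes mentioned above, shown for a 16:9 frame.
for mp in (0.75, 1.0, 1.5):
    print(mp, dims_for_megapixels(mp, 16, 9))
```

Because of the snapping, the result only approximates the requested pixel count, which should be fine for picking a starting resolution.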
* Text - text gets lost at 1.5x upscale but appears to be restored at 2.0x upscale. I've included a prompt from the Comfy Qwen blog
* Landscapes (Not tested)
* Cityscapes (Not tested)
* Interiors (Not tested)
* Portraits - Closeups are not great (older male subjects fare better). Okay with full-body and mid-length shots. Ironically, use 0.75MP to smooth out features. It's obsessed with freckles. Avoid. This may be fixed by https://www.reddit.com/r/StableDiffusion/comments/1mjys5b/18_qwenimage_realism_lora_samples_first_attempt/ by the never-sleeping u/AI_Characters