Fast Wan 2.2 Image-to-Video workflow using a dual GGUF cascade strategy: the first 2 sampling steps run on the q5High model to establish structure and composition, then the final 2 steps hand off to q5Low for refinement. The result is a fast 4-step generation with little loss in fidelity.
What you get:
- 4 total steps (2+2 split: q5High, then q5Low)
- Portrait 480x832 @ 49 frames / 16fps
- Runs on VRAM-constrained GPUs (GGUF keeps memory low)
- UMT5 XXL fp8 text encoder + Wan 2.1 VAE
- euler sampler, CFG 1
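For a quick sanity check on what these settings produce, here is a small sketch of the arithmetic. The clip length follows directly from the frame count and fps listed above. The latent sizes are an assumption: they rely on the Wan 2.1 VAE compressing 4x in time and 8x in each spatial dimension, which this description does not state.

```python
# Settings taken from the list above.
WIDTH, HEIGHT = 480, 832
FRAMES, FPS = 49, 16

# Clip length in seconds.
duration_s = FRAMES / FPS

# Assumption (not stated in this description): the Wan 2.1 VAE compresses
# 4x temporally (first frame kept on its own) and 8x spatially.
latent_frames = (FRAMES - 1) // 4 + 1
latent_w, latent_h = WIDTH // 8, HEIGHT // 8

print(f"{duration_s:.2f} s clip, latent {latent_frames} x {latent_h} x {latent_w}")
```

Under those assumptions, a frame count of the form 4n + 1 (such as 49) maps to a whole number of latent frames.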
Models needed:
- DasiwaWAN22I2V14BBoundbiteV10_q5High.gguf
- DasiwaWAN22I2V14BBoundbiteV10_q5Low.gguf
- umt5_xxl_fp8_e4m3fn_scaled.safetensors
- wan_2.1_vae.safetensors
How it works:
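The trick is splitting the denoising between the two models. q5High handles the initial denoising pass, where structure and composition are established, and q5Low carries it home by refining detail over the final steps. Note that in Wan 2.2's two-expert design, High and Low normally denote the high-noise and low-noise models rather than quality tiers: both files here are Q5 quantizations, so the speed comes from the low step count, not from q5Low being a lighter model.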
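The 2+2 handoff can be sketched as a pair of step ranges. This is only an illustration, not the workflow's actual graph: `SamplerPass` and `plan_cascade` are hypothetical names, and the fields are modeled on the inputs of ComfyUI's advanced sampler node (`start_at_step`, `end_at_step`, `add_noise`, `return_with_leftover_noise`), assuming the workflow chains two such nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerPass:
    """One sampling pass (hypothetical container, modeled on an advanced sampler node)."""
    model: str
    start_at_step: int
    end_at_step: int
    add_noise: bool                   # only the first pass adds the initial noise
    return_with_leftover_noise: bool  # first pass hands over a still-noisy latent


def plan_cascade(total_steps, handoff_step, first_model, second_model):
    """Split one denoising schedule into two consecutive passes."""
    if not 0 < handoff_step < total_steps:
        raise ValueError("handoff_step must fall strictly inside the schedule")
    return [
        SamplerPass(first_model, 0, handoff_step, True, True),
        SamplerPass(second_model, handoff_step, total_steps, False, False),
    ]


passes = plan_cascade(
    total_steps=4,
    handoff_step=2,
    first_model="DasiwaWAN22I2V14BBoundbiteV10_q5High.gguf",
    second_model="DasiwaWAN22I2V14BBoundbiteV10_q5Low.gguf",
)
for p in passes:
    print(f"steps {p.start_at_step}-{p.end_at_step}: {p.model}")
```

Both passes share the same 4-step schedule; the second one simply resumes at step 2 on the latent the first pass left behind, which is why it adds no fresh noise.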
