WAN 2.2 IMAGE to VIDEO with Caption and Postprocessing

Type: Workflows

Published: Aug 10, 2025

Base Model: Wan Video 2.2 I2V-A14B

Hash (AutoV2): EE31EA1A1A
Howling Aurora
tremolo28

Workflow: Image -> Autocaption (Prompt) -> WAN I2V with Upscale and Frame Interpolation and Video Extension

  • Creates video clips at 480p or 720p resolution.

There is a Florence caption version and an LTX Prompt Enhancer (LTXPE) version. The LTXPE version is heavier on VRAM.


MultiClip: Wan 2.2 14B I2V version supporting the LightX2V Wan 2.2 Loras. It creates clips with 4-6 steps and can extend them up to 3 times; see the posted examples, which are 15-20 sec long.

There is a normal version, which lets you use your own prompts, and a version using LTXPE for autoprompting. The normal version works well for specific or NSFW clips with Loras; the LTXPE version is made to just drop in an image, set width/height and hit run. The clips are combined into one full video at the end.

  • supports the new Wan 2.2 LightX2v Loras for low steps

  • in addition, you can inject the "old" LightX2v Wan 2.1 Lora. This can help to avoid slow-motion clips and can introduce more dynamic motion.

  • supports Wan 2.2 Loras per sequence

  • Single Clip versions are included, which correspond to the V1.0 workflow below with an additional Lora loader for the "old" Wan 2.1 LightX2v Lora.

Since Wan 2.2 uses 2 models, the workflow gets complex. I still recommend checking the Wan 2.1 MultiClip version, which is much leaner and has a rich selection of Loras. It can be found here: https://civitai.com/models/1309065?modelVersionId=1998473
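As a rough sanity check on the 15-20 sec figure for MultiClip, the total length can be estimated from the number of segments. This is only a sketch: the 81 frames per clip and 16 fps used below are assumed values that are not stated on this page, and the calculation ignores any frames the workflow may drop or overlap when joining clips.

```python
def multiclip_seconds(frames_per_clip: int, fps: float, extensions: int) -> float:
    """Estimate total video length for one base clip plus N extensions.

    Assumes every segment has the same frame count and that segments
    are simply concatenated (no overlapping or dropped frames).
    """
    if extensions < 0:
        raise ValueError("extensions must be >= 0")
    segments = 1 + extensions  # the base clip plus each extension
    return segments * frames_per_clip / fps


# Assumed values: 81 frames per clip at 16 fps (not taken from this page).
print(multiclip_seconds(81, 16, 0))  # single clip: 5.0625
print(multiclip_seconds(81, 16, 3))  # extended 3 times: 20.25
```

With these assumed values, three extensions land at the upper end of the 15-20 sec range; use the frame count and frame rate actually set in the workflow for a real estimate.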


V1.0: WAN 2.2 14B Image to Video workflow with LightX2v I2V Wan 2.2 Lora support for low steps (4-8 steps)

  • Wan 2.2 uses 2 models to process a clip: a High Noise and a Low Noise model, processed in sequence.

  • compatible with LightX2v Loras to process clips fast with low steps.

  • compatible with some of the "old" Wan 2.1 Loras and the "new" Wan 2.2 Loras

  • See notes in workflow and Tips below.
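To illustrate the two-model processing described above, here is a minimal sketch of how a step budget could be divided between the High Noise and Low Noise passes. In ComfyUI this kind of split is commonly wired with two KSampler (Advanced) nodes using their start_at_step and end_at_step inputs; whether this workflow uses exactly that wiring, and the half-and-half split shown here, are assumptions. The helper name `split_steps` is hypothetical.

```python
def split_steps(total_steps: int, high_noise_steps: int) -> dict[str, tuple[int, int]]:
    """Return (start_step, end_step) ranges for the two sampling passes.

    The High Noise model handles the first steps, the Low Noise model
    continues from where the first pass stopped.
    """
    if not 0 < high_noise_steps < total_steps:
        raise ValueError("high_noise_steps must be between 1 and total_steps - 1")
    return {
        "high_noise": (0, high_noise_steps),
        "low_noise": (high_noise_steps, total_steps),
    }


# Assumed half-and-half split for the 4-8 step range mentioned above.
for total in (4, 6, 8):
    print(total, split_steps(total, total // 2))
```

The split point is a tuning choice; check the notes inside the workflow for the values the author actually uses.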

Models can be downloaded here:

Models (Low & High Noise required, pick the ones matching your VRAM): https://huggingface.co/bullerwins/Wan2.2-I2V-A14B-GGUF/tree/main

LightX2v Loras for Wan 2.2 (I2V, High and Low): https://huggingface.co/Kijai/WanVideo_comfy/tree/main/Wan22-Lightning

LightX2v Lora (old Wan 2.1): https://huggingface.co/lightx2v/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v/tree/main/loras

VAE (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/vae

Text encoder (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/text_encoders


WAN 2.2 I2V 5B Model (GGUF) workflow with Florence or LTXPE auto caption

  • lower quality than 14B model

  • 720p @ 24 frames

  • with the FastWan Lora, use a CFG of 1 and 4-5 steps; place a LoraLoader node after the Unet Loader to inject the Lora

FastWan Lora: https://huggingface.co/Kijai/WanVideo_comfy/tree/main/FastWan

Model (GGUF, pick the model matching your VRAM): https://huggingface.co/QuantStack/Wan2.2-TI2V-5B-GGUF/tree/main

VAE: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/vae

Text encoder (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/text_encoders


Where to save these files within your ComfyUI folder:

Wan GGUF Model -> models/unet

Textencoder -> models/clip

Vae -> models/vae
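The folder mapping above can be checked with a small script before loading the workflow. This is a minimal sketch: the ComfyUI root path and all file names below are hypothetical placeholders, so substitute the files you actually downloaded.

```python
from pathlib import Path

# Folder mapping as listed above, relative to the ComfyUI root.
TARGET_DIRS = {
    "unet": "models/unet",          # Wan GGUF models
    "text_encoder": "models/clip",  # text encoder
    "vae": "models/vae",            # VAE
}


def target_path(comfy_root: str, kind: str, filename: str) -> Path:
    """Build the expected location of a file inside the ComfyUI folder."""
    return Path(comfy_root) / TARGET_DIRS[kind] / filename


def missing_files(comfy_root: str, wanted: dict[str, list[str]]) -> list[Path]:
    """Return the expected paths that do not exist on disk yet."""
    return [
        target_path(comfy_root, kind, name)
        for kind, names in wanted.items()
        for name in names
        if not target_path(comfy_root, kind, name).is_file()
    ]


# Hypothetical file names, for illustration only.
wanted = {
    "unet": ["high_noise_model.gguf", "low_noise_model.gguf"],
    "text_encoder": ["text_encoder.safetensors"],
    "vae": ["vae.safetensors"],
}
for path in missing_files("ComfyUI", wanted):
    print("missing:", path)
```

Anything printed as missing still needs to be downloaded or moved into the listed folder.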


Tips (for 14B model):