Workflow: Image -> Autocaption (Prompt) -> WAN I2V with Upscaling, Frame Interpolation and Video Extension
Creates video clips at 480p or 720p resolution.
There is a Florence Caption version and an LTX Prompt Enhancer (LTXPE) version; LTXPE is heavier on VRAM.
MultiClip: Wan 2.2 14B I2V version supporting the LightX2V Wan 2.2 Loras to create clips with 4-6 steps and extend them up to 3 times; see the posted examples with 15-20 sec of length.
There is a normal version that lets you use your own prompts and a version using LTXPE for autoprompting. The normal version works well for specific or NSFW clips with Loras, while the LTXPE version is made to simply drop in an image, set width/height and hit run. The clips are combined into one full video at the end.
supports the new Wan 2.2 LightX2v Loras for low step counts
in addition, you can inject the "old" Wan 2.1 LightX2v Lora; this can help to avoid slow-motion clips and introduce more dynamic motion
supports Wan 2.2 Loras per sequence
Single Clip versions are included; they correspond to the V1.0 workflow below, with an additional Lora loader for the "old" Wan 2.1 LightX2v Lora.
Since Wan 2.2 uses 2 models, the workflow gets complex. I still recommend checking out the Wan 2.1 MultiClip version, which is much leaner and has a rich selection of Loras. It can be found here: https://civitai.com/models/1309065?modelVersionId=1998473
V1.0 WAN 2.2 14B Image to Video workflow with LightX2v I2V Wan 2.2 Lora support for low steps (4-8 steps)
Wan 2.2 uses 2 models to process a clip: a High Noise and a Low Noise model, run in sequence (a minimal sketch of this two-pass sampling follows after this list).
compatible with LightX2v Loras to process clips fast with low steps
compatible with some of the "old" Wan 2.1 Loras and the "new" Wan 2.2 Loras
See notes in workflow and Tips below.
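For reference, here is a minimal sketch of how the two-pass sampling typically looks in ComfyUI's API (JSON) prompt format. The node IDs, the 6-step total and the 3/3 split between High and Low Noise are illustrative assumptions, not values taken from the workflow file:

```python
# Minimal sketch of the Wan 2.2 two-pass sampling (ComfyUI API prompt format).
# Node IDs, the 6-step total and the 3/3 split are illustrative assumptions.
two_pass = {
    "10": {  # first pass: High Noise model, keeps leftover noise for the hand-off
        "class_type": "KSamplerAdvanced",
        "inputs": {
            "model": ["1", 0],             # High Noise UNet (e.g. GGUF loader output)
            "add_noise": "enable",
            "noise_seed": 42,
            "steps": 6,
            "cfg": 1.0,                    # CFG 1 is typical with the LightX2v/Lightning Loras
            "sampler_name": "euler",
            "scheduler": "simple",
            "positive": ["5", 0],
            "negative": ["6", 0],
            "latent_image": ["7", 0],      # latent from the I2V conditioning
            "start_at_step": 0,
            "end_at_step": 3,
            "return_with_leftover_noise": "enable",
        },
    },
    "11": {  # second pass: Low Noise model finishes the remaining steps
        "class_type": "KSamplerAdvanced",
        "inputs": {
            "model": ["2", 0],             # Low Noise UNet
            "add_noise": "disable",
            "noise_seed": 42,
            "steps": 6,
            "cfg": 1.0,
            "sampler_name": "euler",
            "scheduler": "simple",
            "positive": ["5", 0],
            "negative": ["6", 0],
            "latent_image": ["10", 0],     # takes the latent handed over by the first pass
            "start_at_step": 3,
            "end_at_step": 6,
            "return_with_leftover_noise": "disable",
        },
    },
}
```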
Models can be downloaded here:
Models (Low & High Noise required, pick the ones matching your Vram): https://huggingface.co/bullerwins/Wan2.2-I2V-A14B-GGUF/tree/main
LightX2v Loras for Wan 2.2 (I2V, Hi and Lo): https://huggingface.co/Kijai/WanVideo_comfy/tree/main/Wan22-Lightning
LightX2v Lora (old Wan 2.1): https://huggingface.co/lightx2v/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v/tree/main/loras
Vae (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/vae
Textencoder (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/text_encoders
WAN 2.2 I2V 5B Model (GGUF) workflow with Florence or LTXPE auto caption
lower quality than the 14B model
720p @ 24 fps
with the FastWan Lora, use a CFG of 1 and 4-5 steps; place a LoraLoader node after the Unet Loader to inject the Lora (see the sketch after this list)
FastWan Lora: https://huggingface.co/Kijai/WanVideo_comfy/tree/main/FastWan
Model (GGUF, pick model matching your Vram): https://huggingface.co/QuantStack/Wan2.2-TI2V-5B-GGUF/tree/main
VAE: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/vae
Textencoder (same as Wan 2.1): https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/text_encoders
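A minimal sketch of the FastWan Lora injection mentioned above, in ComfyUI API format. The node IDs and filenames are placeholders; the point is only the wiring (GGUF Unet Loader -> LoraLoaderModelOnly -> sampler "model" input):

```python
# Sketch of injecting the FastWan Lora in the 5B workflow (ComfyUI API format).
# Node IDs and filenames are placeholders; only the wiring matters:
# UnetLoaderGGUF -> LoraLoaderModelOnly -> sampler "model" input.
fastwan_patch = {
    "1": {
        "class_type": "UnetLoaderGGUF",                 # from the ComfyUI-GGUF custom nodes
        "inputs": {"unet_name": "Wan2.2-TI2V-5B-Q4_K_M.gguf"},   # placeholder filename
    },
    "2": {
        "class_type": "LoraLoaderModelOnly",
        "inputs": {
            "model": ["1", 0],                          # UNet from the GGUF loader
            "lora_name": "FastWan_5B_lora.safetensors", # placeholder filename
            "strength_model": 1.0,
        },
    },
    # The KSampler's "model" input should then point at ["2", 0] instead of ["1", 0],
    # with cfg = 1 and 4-5 steps as noted above.
}
```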
Locations to save these files within your ComfyUI folder (a download sketch follows after this list):
Wan GGUF Model -> models/unet
Textencoder -> models/clip
Vae -> models/vae
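If you prefer to script the downloads, here is a minimal sketch using huggingface_hub (assumed to be installed). The filenames are examples only; check the repo listings above and pick the quant matching your Vram:

```python
# Minimal download sketch using huggingface_hub; filenames are examples only.
import shutil
from huggingface_hub import hf_hub_download

COMFY = "/path/to/ComfyUI"   # adjust to your install

def fetch(repo_id, filename, dest_dir):
    # hf_hub_download caches the file and returns its local path;
    # copy it into the flat ComfyUI model folder afterwards.
    path = hf_hub_download(repo_id=repo_id, filename=filename)
    shutil.copy(path, dest_dir)

fetch("bullerwins/Wan2.2-I2V-A14B-GGUF",
      "HighNoise/Wan2.2-I2V-A14B-HighNoise-Q4_K_M.gguf",            # example filename
      f"{COMFY}/models/unet")
fetch("Comfy-Org/Wan_2.1_ComfyUI_repackaged",
      "split_files/vae/wan_2.1_vae.safetensors",
      f"{COMFY}/models/vae")
fetch("Comfy-Org/Wan_2.2_ComfyUI_Repackaged",
      "split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors",
      f"{COMFY}/models/clip")
```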
Tips (for the 14B model):
Confused about Wan 2.2 and Loras? Check this thread: https://civitai.com/models/1823416?modelVersionId=2063446&dialog=commentThread&commentId=890870
Wan 2.2 I2V Prompting Tips: https://civitai.com/models/1823416?modelVersionId=2063446&dialog=commentThread&commentId=890880
Which GGUF model should you download? With my 16gb Vram/64gb Ram, I usually go for a model of around 10gb (e.g. the "...Q4_K_M.gguf" model).
To counter slow-motion effects, you can try reducing the strength of the High Noise Lora to around 0.5 with 6-8 steps (the black "LoraLoaderModelOnly LightX Lora Loader Hi" node); a minimal sketch of those settings follows below.
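A minimal sketch of that Hi Lora loader's settings in ComfyUI API format; the Lora filename and the upstream node ID are placeholders:

```python
# Sketch of the Hi (High Noise) LightX2v Lora loader with reduced strength
# (ComfyUI API format; filename and upstream node ID are placeholders).
lightx_hi_loader = {
    "class_type": "LoraLoaderModelOnly",
    "inputs": {
        "model": ["<high_noise_unet_node>", 0],
        "lora_name": "Wan22_Lightning_high_noise.safetensors",  # placeholder filename
        "strength_model": 0.5,   # lowered from 1.0 to counter slow motion, with 6-8 steps
    },
}
```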