Updated: Aug 30, 2025
Introduction
Unleash a new form of AI creativity! This workflow harnesses the specialized power of the Wan 2.2 Fun 5B Inpainting model to generate stunning, seamless video animations that morph one image into another.
Forget text prompts driving the motion. Here, you provide a starting image and an ending image. The AI intelligently analyzes both and generates a fluid video transition between them, interpreting the content and creating a natural, often dreamlike, transformation. Perfect for creating mesmerizing loops, concept art evolution, or simply bringing two ideas together in a video.
This workflow is optimized for accessibility, utilizing GGUF quantization to run this powerful model efficiently on consumer hardware.
✨ Key Features & Highlights
Image-to-Image Morphing: The core feature. Input a start and end image, and let the AI generate the transitional video.
GGUF Quantization Support: Powered by the LoaderGGUF and ClipLoaderGGUF nodes, making the 5B parameter model runnable without a top-tier GPU.
Lightning LoRA for Speed: Integrates a 4-step LoRA, significantly speeding up the generation process compared to standard sampling.
Simple & Intuitive Setup: The workflow is cleanly grouped into logical steps: Load Models, Upload Images, Set Prompt, and Generate.
High-Quality Output: Configured for a high resolution (944x944) and a smooth frame rate (24 FPS), packaged into an MP4 file by the robust VHS_VideoCombine node.
🧩 How It Works (The Magic Behind the Scenes)
This workflow is elegant in its execution:
Load Models (GGUF): The LoaderGGUF node loads the quantized Wan2.2-Fun-5B-InP model, and the ClipLoaderGGUF node loads the UMT5 text encoder. A standard VAELoader node loads the Wan VAE for decoding.
Upload Your Image Pair: This is the crucial step. You provide two inputs:
start_image: The initial state of your animation.
end_image: The final state you want to transition to.
Define the Vibe (Prompt): While the images drive the motion, the text prompt helps define the style and quality of the entire generated sequence. The included positive prompt creates a "dreamy, Q-style" look, while the negative prompt filters out common artifacts.
Inpainting Magic (WanFunInpaintToVideo): This specialized node is the engine. It takes your two images, encodes them along with your prompts, and prepares a latent video representation that transitions from the start image to the end image.
Fast Sampling: The prepared latent is passed to the KSampler, which uses the 4-step Lightning LoRA to rapidly denoise the sequence, creating the final frames of the animation.
Decode & Export: The VAE decodes the latent frames into images, and the VHS_VideoCombine node seamlessly compiles them into a final, high-quality MP4 video file.
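Conceptually, the inpainting step can be pictured as a frame sequence where only the first and last frames are pinned and everything in between is masked for the sampler to generate. The sketch below is a simplified illustration of that masking idea using NumPy; it is not the WanFunInpaintToVideo node's actual implementation, which operates on compressed latents:

```python
import numpy as np

def build_inpaint_mask(num_frames: int) -> np.ndarray:
    """Per-frame mask: 0.0 = known frame to keep, 1.0 = frame to generate."""
    mask = np.ones(num_frames, dtype=np.float32)
    mask[0] = 0.0    # start_image is pinned as the first frame
    mask[-1] = 0.0   # end_image is pinned as the last frame
    return mask

mask = build_inpaint_mask(121)
print(mask[0], mask[60], mask[-1])  # 0.0 1.0 0.0
```

Everything between the two pinned frames is "inpainted" by the diffusion model, which is why the transition feels generated rather than interpolated.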
⚙️ Instructions & Usage
Prerequisite: Download Models
You must download the following model files and place them in your ComfyUI models directory.
Essential Models:
Wan2.2-Fun-5B-InP-Q8_0.gguf → Place in /models/unet/ (or /models/diffusion/)
umt5-xxl-encoder-q4_k_m.gguf → Place in /models/clip/
wan_2.2_vae.safetensors → Place in /models/vae/
For the 4-Step Lightning Pipeline:
Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors → Place in /models/loras/
(Note: The workflow currently points to this LoRA, but the in-node note mentions a different official LoRA. You may want to clarify which one to use in your upload.)
Loading the Workflow
Download the provided video_wan2_2_5B_fun_inpaint.json file.
In ComfyUI, drag and drop the JSON file into the window or use the Load button.
Running the Workflow
Upload Your Image Pair:
In the "LoadImage" node on the left, upload your start_image.png.
In the "LoadImage" node on the right, upload your end_image.png.
Tip: For best results, use images that are similar in composition or theme for a more coherent morph.
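Since the workflow renders at 944x944, start and end images with different aspect ratios can end up stretched differently. A small preprocessing sketch using Pillow (the 944x944 target comes from this workflow; the file names are placeholders) that center-crops and resizes both images to the same square:

```python
from PIL import Image

TARGET = 944  # matches the workflow's output resolution

def prepare(path: str, out_path: str, size: int = TARGET) -> None:
    """Center-crop to a square, then resize to size x size."""
    img = Image.open(path).convert("RGB")
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((size, size), Image.LANCZOS)
    img.save(out_path)

# prepare("start_image.png", "start_944.png")
# prepare("end_image.png", "end_944.png")
```

Running both inputs through the same crop keeps subjects aligned, which noticeably improves morph coherence.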
Set Your Prompt (Optional but Recommended):
Modify the text in the "CLIP Text Encode (Positive Prompt)" node to influence the style of the entire generated video (e.g., "watercolor style", "cyberpunk", "realistic").
The negative prompt is pre-filled and works well for general use.
Queue Prompt! Watch as the AI dreams up a unique animation that bridges your two images.
⚠️ Important Notes & Tips
Image Guidance: The strength of this model is in the images. The text prompt plays a secondary role in styling. The core narrative of the video is the transition between your two uploaded images.
Length Setting: The WanFunInpaintToVideo node has a length parameter set to 121 frames. At 24 FPS, this yields roughly a 5-second video. You can adjust this value for shorter or longer animations, but be mindful of VRAM constraints.
Resolution: The workflow is set to 944x944. You can adjust the width and height in the WanFunInpaintToVideo node, but this will also impact VRAM usage and performance.
Lightning LoRA: The 4-step LoRA is used for speed. If you encounter quality issues or want to try a different style, you can adjust the strength in the LoraLoaderModelOnly node or try the official wan2.2_i2v_lightx2v_4steps_lora mentioned in the node's info.
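The relationship between the length parameter, frame rate, and clip duration is simple arithmetic. As a quick sanity check (the 4n+1 frame-count convention is typical of Wan-family models due to 4x temporal latent compression; verify against the values your node actually accepts):

```python
def duration_seconds(num_frames: int, fps: int = 24) -> float:
    """Clip duration for a given frame count and frame rate."""
    return num_frames / fps

def is_valid_wan_length(num_frames: int) -> bool:
    """Wan-family models typically expect 4n + 1 frames (assumption)."""
    return num_frames >= 1 and (num_frames - 1) % 4 == 0

print(round(duration_seconds(121), 2))  # 5.04 seconds at 24 FPS
print(is_valid_wan_length(121))         # True (121 = 4 * 30 + 1)
```

So to target a 3-second clip, a length of 73 frames (4 * 18 + 1 = 73, about 3.04 s) would be the nearest valid value under this convention.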
🎬 Example Results
Start Image: A closed flower bud.
End Image: The same flower in full bloom.
Prompt: "A beautiful time-lapse of a flower blooming, macro photography, sharp focus, cinematic lighting."
(You would embed a short video example generated by this workflow here)
Another idea: Start with a sketch, end with the fully rendered artwork.
🔗 Download & Links
Wan 2.2 Fun 5B Inpaint GGUF Model: HuggingFace - QuantStack/Wan2.2-Fun-5B-InP-GGUF
umt5-xxl-encoder-q4_k_m.gguf: https://huggingface.co/city96/umt5-xxl-encoder-gguf/tree/main
Wan2.2_VAE.safetensors: https://huggingface.co/QuantStack/Wan2.2-Fun-5B-InP-GGUF/tree/main/vae
Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors: https://huggingface.co/Kijai/WanVideo_comfy/blob/main/FastWan/Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors
Conclusion
This workflow opens up a fascinating and less-explored avenue of AI video generation. By moving beyond text-to-video to image-guided-video, it allows for precise and creative control over the starting and ending points of an animation. The use of GGUF makes this creative power accessible to a wide audience.
It's perfect for artists, designers, and anyone looking to create unique, seamless transitions and visual stories. Experiment with different image pairs and prompts to discover the full potential of this "Fun" model.
We can't wait to see what you morph! Share your creations in the comments below.