
Notes on creating morphing animations using AnimateDiff/ComfyUI

ComfyUI:

There are great resources available online such as https://civitai.com/models/372584/ipivs-morph-img2vid-animatediff-lcm-hyper-sd and https://civitai.com/models/590964/wbmix-img2vid-animatediff-lcm and https://civitai.com/models/692801/colorsmix-img2vid-animatediff-lcm and ...

I did not use the above workflows as-is; I studied them one by one and adjusted them to my needs and resources. It took some time to figure out a few parts. Here are my notes.

System: Intel CPU with built-in GPU (VRAM shared with RAM, no dedicated VRAM), using SD1.5 models; RAM usage tops out at about 16GB (everything set to float16 except the VAE, which runs in float32)

  • Used LCM Checkpoints for sharper and more vivid colors, e.g., https://civitai.com/models/306814/photon-lcm

  • A script decodes the samples one by one and then saves the output as a video (custom code using ffmpeg) to avoid system crashes due to OOM or other issues (see the sketch below),

  • Input images all fit to 512,

  • Custom code generates the input masks (fitted to 512x512), saved as an input video used to guide the transitions between images (see the sketch below),

  • No ControlNet was used,

  • Only IPAdapter, one for each input image (w=1, w-type=ease-in-out, combine=concat, start=0, end=1, scale=V only),

  • An LCM-compatible motion model, no motion LoRA, motion scale=1.25, overlap=4, method=pyramid, beta-schedule=autoselect,

  • KSampler seed=0, steps=11, sampler=LCM, scheduler=simple,

The above setup takes about 50 minutes for 96 frames of 512x512.
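
Below is a minimal sketch of the decode-one-frame-at-a-time idea, not my exact script. It assumes the sampler's latents are available as a tensor, that vae is an SD1.5 VAE held in float32 (e.g. a diffusers AutoencoderKL), and that ffmpeg is on the PATH:

    # Sketch: decode latent frames one at a time to keep peak memory low,
    # save each as a PNG, then assemble the PNGs into a video with ffmpeg.
    # Assumptions: `vae` is an SD1.5 VAE (e.g. diffusers AutoencoderKL) in float32,
    # `latents` is a (num_frames, 4, 64, 64) tensor from the sampler.
    import subprocess
    from pathlib import Path

    import torch
    from PIL import Image


    def decode_frames_one_by_one(vae, latents, out_dir="frames"):
        Path(out_dir).mkdir(exist_ok=True)
        for i, latent in enumerate(latents):
            with torch.no_grad():
                # SD1.5 latents are scaled by 0.18215; undo that before decoding.
                image = vae.decode(latent.unsqueeze(0) / 0.18215).sample[0]
            image = ((image.clamp(-1, 1) + 1) * 127.5).to(torch.uint8)
            Image.fromarray(image.permute(1, 2, 0).cpu().numpy()).save(
                f"{out_dir}/{i:05d}.png"
            )
            del image  # free memory before the next frame


    def frames_to_video(frame_dir="frames", out_file="morph.mp4", fps=12):
        # Stitch the numbered PNGs into an mp4; yuv420p keeps it widely playable.
        subprocess.run(
            ["ffmpeg", "-y", "-framerate", str(fps),
             "-i", f"{frame_dir}/%05d.png",
             "-c:v", "libx264", "-pix_fmt", "yuv420p", out_file],
            check=True,
        )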
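
And a minimal sketch of generating the 512x512 transition masks as a numbered frame sequence. The exact mask pattern I used is not shown here; this one is just a soft left-to-right wipe, and the resulting PNGs can be turned into the mask video with the same ffmpeg call as above:

    # Sketch: write one 512x512 grayscale mask per frame, sweeping a soft
    # left-to-right wipe across the clip. Black = keep image A, white = image B.
    # Assumption: a simple linear wipe; swap in any pattern you like.
    from pathlib import Path

    import numpy as np
    from PIL import Image


    def wipe_masks(num_frames=96, size=512, softness=64, out_dir="masks"):
        Path(out_dir).mkdir(exist_ok=True)
        x = np.arange(size, dtype=np.float32)
        for f in range(num_frames):
            # The wipe edge travels a bit past both borders so the first frame
            # is fully black and the last frame is fully white.
            edge = (f / (num_frames - 1)) * (size + softness) - softness / 2
            row = np.clip((edge - x) / softness + 0.5, 0.0, 1.0)
            mask = np.tile(row, (size, 1))
            Image.fromarray((mask * 255).astype(np.uint8), mode="L").save(
                f"{out_dir}/{f:05d}.png"
            )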

The video attached here has been down-scaled and compressed to keep the upload small. The original is very sharp and clear.

  • Input images play a significant role in the quality of the transitions and the overall quality of the video,

  • The checkpoint is also crucial, as most non-LCM models produce very pale, faded outputs,

  • To clarify: ControlNet with the QR Code model did not work as expected, so I removed it entirely and instead let the animated input mask guide the transitions,

  • I did not use UpScaling at all,

In summary, the standard image generation workflow in ComfyUI is: [Model] -> [Prompt] -> [KSampler] -> [Output]. The workflow explained here injects [IPAdapter] -> [AnimateDiff] before [KSampler], that's it.

As for the prompt, I simply put "high quality", since the content of the input images is carried into the result well via the IPAdapters.

Cheers, ZerOne