🎬 Generation Modalities
📝➡️🎥 TextToVideo
Create new videos from scratch using text prompts, optionally with audio input.
🖼️➡️🎥 ImageToVideo
Animate static reference images using text prompts, optionally with audio input.
🎥⚪ VideoToVisualDub
Generate synchronized audio tracks, such as ambience and speech, driven by the video's visuals and the prompts. The original video is kept unchanged.
🎥⚪ VideoToMaskedFaceGen (Warning: as of ComfyUI v0.9.1, Inpaint no longer works.)
Regenerate masked facial areas. Control expressions, lip-sync, and identity using text prompts, optionally with audio input.
ℹ️ Info: The input video resolution after scaling is the same as the output video resolution. Internal spatial downscaling or upscaling is deactivated.
🎙️ Audio Input Settings
⚪ 🔇 No Audio Input
No external audio file is used. The model generates new audio based on your text prompt.
⚪ 🔊 ++ Audio Input
Upload an existing voice or music file to drive the animation, for example for lip-sync.
* Note: Do not use this setting for VisualDub.
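
The rule in the note above can be sketched as a small pre-flight check. This is only an illustration and not part of the workflow itself: the names `Modality`, `AudioSetting`, and `check_settings` are hypothetical, and the only rule encoded is the one stated here, namely that the audio input setting should not be combined with VisualDub.

```python
from enum import Enum


class Modality(Enum):
    # Values mirror the modality names listed above.
    TEXT_TO_VIDEO = "TextToVideo"
    IMAGE_TO_VIDEO = "ImageToVideo"
    VIDEO_TO_VISUAL_DUB = "VideoToVisualDub"
    VIDEO_TO_MASKED_FACE_GEN = "VideoToMaskedFaceGen"


class AudioSetting(Enum):
    # Values mirror the two audio input settings listed above.
    NO_AUDIO_INPUT = "No Audio Input"
    AUDIO_INPUT = "++ Audio Input"


def check_settings(modality: Modality, audio: AudioSetting) -> list[str]:
    """Return a list of problems; an empty list means the combination is allowed.

    Hypothetical helper: it encodes only the documented note that
    '++ Audio Input' should not be used for VisualDub.
    """
    problems = []
    if modality is Modality.VIDEO_TO_VISUAL_DUB and audio is AudioSetting.AUDIO_INPUT:
        problems.append("VideoToVisualDub should use 'No Audio Input'.")
    return problems


if __name__ == "__main__":
    # Example: lip-sync driven by an uploaded voice file on an animated image.
    print(check_settings(Modality.IMAGE_TO_VIDEO, AudioSetting.AUDIO_INPUT))
    # Example: the combination the note warns against.
    print(check_settings(Modality.VIDEO_TO_VISUAL_DUB, AudioSetting.AUDIO_INPUT))
```

Returning a list of messages rather than raising an exception leaves room for further checks later without changing how the helper is called.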

