MULTI-TALK: creates a speaking avatar. This workflow is based on Kijai's workflow.
Please note version 3.2 allows for more body movement.
Updated to version 3.0, which handles single-person and two-person talking in one workflow.
Kijai has made a lot of recent changes, so this workflow has switched to the newer nodes.
Start with a photo (avatar) and create a speaking video.
Most models can be found here:
https://huggingface.co/Kijai/WanVideo_comfy/tree/main
https://huggingface.co/MeiGen-AI/MeiGen-MultiTalk/resolve/main/multitalk.safetensors
Some models used in this workflow:
wan2.1_i2v_480p_14B_fp8_e4m3fn.safetensors
Wan2_1_VAE_bf16.safetensors
Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors
multitalk.safetensors
umt5-xxl-enc-bf16.safetensors
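If you prefer to grab the models from the command line, here is a minimal sketch using huggingface-cli (installed with: pip install huggingface_hub) and curl. The target folders assume the usual ComfyUI layout (models/diffusion_models, models/vae, models/loras, models/text_encoders), and the exact filenames inside Kijai's repo may differ slightly from the list above, so check the repo listing first.

# Run from your ComfyUI root folder; filenames and target folders are illustrative.
huggingface-cli download Kijai/WanVideo_comfy wan2.1_i2v_480p_14B_fp8_e4m3fn.safetensors --local-dir models/diffusion_models
huggingface-cli download Kijai/WanVideo_comfy Wan2_1_VAE_bf16.safetensors --local-dir models/vae
curl -L -o models/diffusion_models/multitalk.safetensors https://huggingface.co/MeiGen-AI/MeiGen-MultiTalk/resolve/main/multitalk.safetensors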
For 2 speakers:
Record voice 1 and voice 2 separately.
You will then need to use the Mask Editor to mask the faces (and even the bodies, if you like) of the 2 people that you want to speak.
Finally, look at the masks and make sure they are in the correct speaking order.
If they are not, adjust the streams in the "Mask Batch" node. The first mask you see is the first speaker.
Note: sometimes both speakers speak simultaneously. I am not aware of a fix for this; it doesn't happen all the time, but it does happen.
In my testing:
A 14-second video at 768x384 took 8 minutes to render on a 3090 with 24GB VRAM and 64GB RAM, with Sage Attention enabled.
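Sage Attention is optional, but it speeds up rendering noticeably. A minimal install sketch for the portable version, assuming a working CUDA build of PyTorch (on Windows you may also need a Triton build such as triton-windows):

python_embeded\python.exe -m pip install sageattention

Once installed, select it as the attention mode in the wrapper's model loader node, if your version exposes that option.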
UPDATE THE WAN VIDEO WRAPPER (this has to be done with git because we are using code that is still in progress):
1. Go to your "custom_nodes" folder.
2. If the subfolder ComfyUI-WanVideoWrapper already exists, you may want to delete it (to do a fresh install).
3. Clone the wrapper:
git clone https://github.com/kijai/ComfyUI-WanVideoWrapper
4. Install the requirements (portable version only):
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\requirements.txt
Repo: https://github.com/kijai/ComfyUI-WanVideoWrapper/
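If you run a regular (non-portable) ComfyUI install, the equivalent step is to install the requirements with the pip from the Python environment ComfyUI runs in, for example:

pip install -r ComfyUI/custom_nodes/ComfyUI-WanVideoWrapper/requirements.txt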
Please note this workflow uses a lot of main memory (RAM).
You must have at least 3 seconds of video for this to work correctly.