Introduction
This workflow uses Segment Anything 2 (SAM2) to segment the part of the original video that will be replaced. The resulting mask is applied with the Set Latent Noise Mask node before the VAE-encoded video is passed to the KSampler.
You provide a reference image for IPAdapter and a prompt to guide the generation of the new object. A depth ControlNet gives additional guidance on the shape the replacement object should take.
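Conceptually, the SAM2 mask tells the sampler which latent pixels it may change; outside the mask the original video latent is kept, so only the segmented object is regenerated. Below is a minimal PyTorch sketch of that blend, with illustrative tensor shapes; it is not the actual ComfyUI node code.

```python
# Sketch of the idea behind Set Latent Noise Mask: keep the denoised latent
# only inside the mask and restore the original VAE-encoded video everywhere
# else, so just the segmented object changes. Shapes are illustrative.
import torch

def apply_noise_mask(denoised: torch.Tensor,
                     original: torch.Tensor,
                     mask: torch.Tensor) -> torch.Tensor:
    """Blend denoised and original latents with a 0..1 mask.

    denoised, original: latents shaped (frames, channels, h, w)
    mask: shaped (frames, 1, h, w); 1 = region to regenerate, 0 = keep.
    """
    return mask * denoised + (1.0 - mask) * original

# Toy example: 16 frames of 4-channel 64x64 latents with a square mask.
frames, c, h, w = 16, 4, 64, 64
original = torch.randn(frames, c, h, w)   # stands in for the VAE-encoded video
denoised = torch.randn(frames, c, h, w)   # stands in for a sampler step output
mask = torch.zeros(frames, 1, h, w)
mask[:, :, 16:48, 16:48] = 1.0            # stands in for the SAM2 segmentation

blended = apply_noise_mask(denoised, original, mask)
assert torch.equal(blended[:, :, 0, 0], original[:, :, 0, 0])  # outside mask is untouched
```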
Models
SD1.5 checkpoint: I use an LCM checkpoint, so I do not need the AnimateLCM LoRA, but you can use your favourite SD1.5 checkpoint. https://civitai.com/models/4384?modelVersionId=252914
AnimateLCM_sd15_t2v.ckpt: download from https://huggingface.co/wangfuyun/AnimateLCM/tree/main and place it in models/animatediff_models (or script the download; see the snippet after this list).
control_v11f1p_sd15_depth_fp16.safetensors and depth_anything_v2_vitl.pth: download both using the ComfyUI Manager.
IPAdapter Plus: download ip-adapter-plus_sd15.safetensors, along with the ViT-H and ViT-G clip_vision models, using the ComfyUI Manager.
SAM2 models: these should download automatically.
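If you prefer to script the AnimateLCM download, huggingface_hub can fetch the file directly. The repo id and filename come from the link above; the local_dir assumes a default ComfyUI folder layout, so adjust it to your install.

```python
# Optional: fetch AnimateLCM_sd15_t2v.ckpt with huggingface_hub instead of the browser.
# local_dir assumes ComfyUI lives at ./ComfyUI; change the path to match your setup.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="wangfuyun/AnimateLCM",
    filename="AnimateLCM_sd15_t2v.ckpt",
    local_dir="ComfyUI/models/animatediff_models",
)
```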
Custom Nodes
Install with Manager:
ComfyUI-VideoHelperSuite
Crystools
KJNodes for ComfyUI
ComfyUI's ControlNet Auxiliary Preprocessors
ComfyUI-Advanced-ControlNet
AnimateDiff Evolved
ComfyUI_IPAdapter_plus