ComfyUI Text to Image to Video with Stable Video Diffusion & RiFE

A mildly organised template that runs Stable Diffusion txt2img, sends the resulting image to Stable Video Diffusion, and then sends the generated frames to RiFE for frame interpolation.

The groups are placed less than ideally because the wires display over the group text instead of under it, and I was trying to keep the text readable. If that didn't happen, I would have placed the groups more efficiently and logically.

Custom Nodes:

Workflow:

  1. Configure the settings for Stable Diffusion, Stable Video Diffusion, RiFE, and Video Output. Stable Video Diffusion doesn't accept text inputs, so the starting image has to come from somewhere else: either an existing image or one generated with another model such as Stable Diffusion v1.5.

  2. Initialize latent.

  3. Send latent to SD KSampler.

  4. Decode latent.

  5. Send the decoded image to Stable Video Diffusion img2vid Conditioning.

  6. Send conditioned latent to SVD KSampler.

  7. Decode latents.

  8. Send the decoded frames to RiFE.

  9. Save Video Output.
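
When choosing the settings in step 1, it can help to estimate how long the final clip will be. The sketch below is only a rough illustration, not part of the workflow itself. It assumes that RiFE inserts (multiplier - 1) new frames between each pair of consecutive frames, so N input frames become (N - 1) * multiplier + 1 output frames; some interpolation nodes count differently, so check the one you use. The names `VideoSettings`, `interpolated_frame_count`, and `clip_duration_seconds` are hypothetical, and the numbers in the usage lines are example values, not the workflow's defaults.

```python
from dataclasses import dataclass


@dataclass
class VideoSettings:
    """Hypothetical container for the settings that affect clip length."""
    svd_frames: int       # frames produced by the SVD KSampler
    rife_multiplier: int  # RiFE interpolation factor (1 = no interpolation)
    output_fps: float     # frame rate set on the video output node


def interpolated_frame_count(frames: int, multiplier: int) -> int:
    """Frame count after interpolation.

    Assumption: (multiplier - 1) new frames are inserted between each
    pair of consecutive frames, and no frames are dropped.
    """
    if frames < 1 or multiplier < 1:
        raise ValueError("frames and multiplier must both be at least 1")
    return (frames - 1) * multiplier + 1


def clip_duration_seconds(settings: VideoSettings) -> float:
    """Length of the saved clip in seconds under the assumption above."""
    if settings.output_fps <= 0:
        raise ValueError("output_fps must be positive")
    total = interpolated_frame_count(settings.svd_frames, settings.rife_multiplier)
    return total / settings.output_fps


# Example values only: 14 frames, 2x interpolation, 12 fps output.
example = VideoSettings(svd_frames=14, rife_multiplier=2, output_fps=12.0)
print(interpolated_frame_count(example.svd_frames, example.rife_multiplier))
print(clip_duration_seconds(example))
```

Raising the RiFE multiplier without raising the output frame rate makes the clip longer and slower; raising both together keeps the duration roughly the same and makes the motion smoother.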
