Image-to-video Comparison Workflow

Type: Workflows
Published: Jul 1, 2024
Base Model: SDXL 1.0
Hash (AutoV2): 24900F34F8

Summary

This workflow was made as an experiment to compare several "image to video" technologies. Starting from a single image, it runs the following four and composites their results into one video:

  • AnimateLCM

  • SVD XT 1.1

  • ToonCrafter

  • FILM Interpolation

Given that ToonCrafter and FILM Interpolation both need at least two frames to produce decent results, the last frame produced by SVD is used in an img2img pass with IPAdapter and ControlNet to keep the style and composition consistent, while fixing some of the "burn in" effect caused by SVD.

This workflow is mostly designed for SDXL (except for AnimateLCM), but can be easily tweaked for SD1.5 as well.

The workflow outputs all the intermediate videos, as well as the final composited result, to ComfyUI's output folder, inside a folder named vidcmp, in a subfolder named after the current date.
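
As a rough illustration of that folder layout, the sketch below builds such a path. The ISO date format (YYYY-MM-DD), the example output location and the helper name `vidcmp_output_dir` are all assumptions made for illustration; the workflow's actual subfolder naming may differ.

```python
from datetime import date
from pathlib import Path


def vidcmp_output_dir(comfyui_output: Path, today: date) -> Path:
    """Return <output>/vidcmp/<date>. The ISO date format is an assumption."""
    return comfyui_output / "vidcmp" / today.isoformat()


# Hypothetical output location; adjust to your own ComfyUI install.
print(vidcmp_output_dir(Path("ComfyUI/output"), date(2024, 7, 1)))
```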

It was tested on an RTX 3090 with 24GB of VRAM.

How to use

Base Image

This is the starting point of the workflow. It is where you select your base model, LoRAs, image resolution, etc., and compose your positive and negative prompts.

AnimateLCM

This part of the workflow generates the AnimateLCM video. It uses a sparse scribble ControlNet as well as an IPAdapter Tiled to make sure the generated video sticks to the original image. Consider decreasing their weights if you want to give it more freedom. There's also a Multival Dynamic node you can adjust to increase or decrease the motion.

There are two sampling passes in this group, as well as a final upscaling pass to bring the video back to the original resolution.

SVD XT 1.1

This group generates a video using the SVD XT 1.1 technology, and also applies an upscaling pass.

LastFrameFix

The sole purpose of this group is to take the last frame of the SVD video and re-render it using img2img to improve its quality, as SVD videos tend to suffer from excessive deformation and burn-in towards the end.

The group uses two ControlNets (Depth and Canny), as well as an IPAdapter to keep the style.

There's a comparer node in the graph you can use to compare the images before and after being improved by the group.

ToonCrafter

This group generates the ToonCrafter video. If you hit out of memory errors, try rerunning the workflow, or reduce the resolution of the image in the node named "ImageScaleToTotalPixels".
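
To get a feel for how lowering the pixel budget changes the frame size, here is a minimal sketch of scaling an image to a target total pixel count while keeping its aspect ratio. The function name `scale_to_total_pixels` is hypothetical, and treating one megapixel as 1024 * 1024 pixels is an assumption; check the node itself for the exact behavior.

```python
import math


def scale_to_total_pixels(width: int, height: int, megapixels: float) -> tuple[int, int]:
    """Scale (width, height) to roughly `megapixels`, keeping the aspect ratio."""
    target = megapixels * 1024 * 1024  # assumed: 1 megapixel = 1024 * 1024 pixels
    factor = math.sqrt(target / (width * height))
    return round(width * factor), round(height * factor)


# Halving the pixel budget shrinks each side by a factor of sqrt(2), not 2.
new_size = scale_to_total_pixels(1024, 1024, 0.5)
```

Because memory use grows with the total pixel count, a modest reduction in megapixels can be enough to get past an out-of-memory error.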

It takes two images as input: the base image created at the beginning of the workflow, and the improved last frame from SVD XT.

Moreover, ToonCrafter is the only one of the four techs that generates a fixed number of frames (16). The workflow repeats its batches in a loop to match the number of frames output by AnimateLCM and SVD XT.
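
The frame-matching idea can be sketched in plain Python as follows. This is only an illustration of looping a short clip up to a target length, not the workflow's actual node graph; `loop_to_length` is a hypothetical helper and the target count of 25 is an assumed value.

```python
from itertools import cycle, islice


def loop_to_length(frames: list, target_count: int) -> list:
    """Repeat `frames` in order until exactly `target_count` frames exist."""
    if not frames:
        raise ValueError("need at least one frame to loop")
    return list(islice(cycle(frames), target_count))


toon_frames = list(range(16))             # stand-ins for the 16 ToonCrafter frames
looped = loop_to_length(toon_frames, 25)  # 25 is an assumed target frame count
```

With these values the clip plays once in full and then restarts from its first frame until the target length is reached.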

FILM

This group is responsible for the FILM video frame interpolation. It takes two images as input: the base image created at the beginning of the workflow, and the improved last frame from SVD XT.

Final Composition

These groups create the 2x2 video comparing all four img2video techs, and also append all the videos together into a single final video.
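
For readers who want to reproduce the composition outside ComfyUI, the sketch below shows the two operations on NumPy arrays. It assumes every clip has the same frame count and resolution and is stored as (frames, height, width, channels); the helper names and the panel order are illustrative assumptions, not taken from the workflow.

```python
import numpy as np


def compose_2x2(a, b, c, d):
    """Tile four clips shaped (frames, height, width, channels) into a 2x2 grid."""
    top = np.concatenate([a, b], axis=2)          # side by side along width
    bottom = np.concatenate([c, d], axis=2)
    return np.concatenate([top, bottom], axis=1)  # stacked along height


def append_clips(*clips):
    """Play clips one after another by concatenating along the frame axis."""
    return np.concatenate(clips, axis=0)


# Four dummy clips: 4 frames of 8x8 RGB, each filled with a different value.
clips = [np.full((4, 8, 8, 3), i, dtype=np.uint8) for i in range(4)]
grid = compose_2x2(*clips)       # one video showing all four panels at once
sequence = append_clips(*clips)  # one video playing the four clips in turn
```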