IT'S ALIVE! - Fast IMAGE to VIDEO | CogVIDEO-X-FUN1.1 (2B/5B)

Thanks to https://github.com/kijai for all his amazing work; none of this would be possible without him. Check out his page and leave him some stars! \m/

BLAZING FAST IMAGE TO VIDEO!

Animate one image or use two images as First/Last frames.

Less than 12 GB VRAM!

Check older versions for more video examples.

This workflow includes different ways to make videos and uses two Cog models:

CogVideoX-Fun 1.1 InP models (2B and 5B):

- 2B is faster and good enough to animate a static scene from a single image in "sticky" and "zoom" mode.

- 5B is more accurate, works better in first/last and creative mode, and allows using Tora custom trajectories if needed.

____________________________________________________________________________________________

The potential of this particular Cog model went unnoticed, so I had to share it.
I know, I know, there are many Cog models and it's confusing. That's why I'm here.

In short: with this method, CogVideoX-Fun 1.1 generates movement and details according to your prompt while transitioning between two chosen frames.
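For the curious, here is a conceptual sketch (not the workflow's actual code; the frame count, sizes, and shapes are my assumptions) of how an "InP"-style first/last setup can be framed: a conditioning clip where only the first and last frames are filled in and everything in between is left masked for the model to generate.

```python
import numpy as np

# Hypothetical illustration only: a conditioning clip plus a mask, where
# 1 in the mask means "generate this frame" and 0 means "keep this frame".
num_frames, h, w = 49, 512, 512              # assumed frame count and size
cond = np.zeros((num_frames, h, w, 3), dtype=np.uint8)
mask = np.ones((num_frames, h, w, 1), dtype=np.uint8)

first = np.zeros((h, w, 3), dtype=np.uint8)  # stand-in for your first image
last = np.zeros((h, w, 3), dtype=np.uint8)   # stand-in for your last image

cond[0], mask[0] = first, 0                  # keep the first frame
cond[-1], mask[-1] = last, 0                 # keep the last frame
# The model fills the masked frames in between, guided by your prompt.
```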

You can also use a single image.


Different methods are provided and can be selected with a slider:

  • Creative: In this mode, a single image is inserted, and a video is generated. The model has full creative freedom to create an ending.

  • Sticky: In this mode, a single image is inserted, and an image very similar to it, seen from a different perspective, is automatically created and used as the ending of the video, forcing the result to stay as close as possible to the general composition. Everything between the first and last image is generated by the model. This method is good for creating perfect loops.

  • Zoom: In this mode, a single image is inserted, and the final frame of the video will be the same image but zoomed in. You can choose the amount of zoom through a slider (a rough sketch of this idea follows the list below).

  • First/Last: In this mode, two or three images can be inserted, forcing the video to create transitions between them. This mode is very useful for creating controlled animations and works very well with the 5B model.

  • SingleIMG: This mode is similar to the first one (Creative), but it tends to stick to the composition of the input image and be less chaotic at the end.
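To make the "Zoom" mode above more concrete, here is a minimal sketch of the idea: build the last frame by center-cropping the input and scaling it back up. This is only an illustration of the concept using Pillow; the function name and the exact zoom math are my assumptions, not the workflow's actual node.

```python
from PIL import Image

def make_zoomed_last_frame(path: str, zoom: float = 1.3) -> Image.Image:
    """Center-crop by the zoom factor, then resize back to the original size."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    crop_w, crop_h = int(w / zoom), int(h / zoom)
    left, top = (w - crop_w) // 2, (h - crop_h) // 2
    cropped = img.crop((left, top, left + crop_w, top + crop_h))
    return cropped.resize((w, h), Image.LANCZOS)

# First frame = the original image, last frame = the zoomed version;
# the model then generates the frames in between.
make_zoomed_last_frame("input.png", zoom=1.3).save("last_frame_zoomed.png")
```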

__________________________________________________________________________________________________

Beware when using Tora! Each time you change the base resolution, you must redo all trajectories!
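My reading of why this happens (an assumption on my part, not something verified in the node's source): the spline points are stored as absolute pixel coordinates on the editing canvas, so the same point lands at a different relative position once the canvas size changes.

```python
# The same pixel coordinate lands somewhere else, relatively, on a bigger canvas.
point = (256, 256)                     # drawn as the exact centre of a 512 canvas
old_res, new_res = 512, 768
rel_old = (point[0] / old_res, point[1] / old_res)   # (0.5, 0.5)
rel_new = (point[0] / new_res, point[1] / new_res)   # (~0.33, ~0.33)
print(rel_old, rel_new)                # same pixels, different place in the frame
```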

__________________________________________________________________________________________________

Two versions of the workflow are included in the zip file: a full version and a mini version.

The MINI version does not include a lot of extra stuff, only the essentials.

If you are experiencing issues, try using the mini workflow.

__________________________________________________________________________________________________

### Minimum Hardware Requirements:

12GB VRAM or less for low resolution.

I'm on a 3090, and it never fills more than 16 GB, even when experimenting with higher resolutions; VRAM use is mostly about the resolution you pick.

### Render Times:

On a 3090, each video takes from 5 seconds to 2 minutes, depending on resolution and steps.

I've tested everything about this COG model.

You'll get the best and fastest results using the settings provided in this workflow.
Given how much time I've spent testing Cog, I challenge you to find better settings 😁

### Infos:

The benefit of using this CogVideoX-Fun 1.1 from Alibaba instead of the other THUDM/CogVideoX-I2V is flexibility (it can handle any size and ratio) and speed: being able to generate at lower resolutions gives more speed, which is crucial in video generation.

So, you can pick any size/ratio images, vertical or horizontal. My tests show that it doesn't seem to matter much.

(Pick two images with the same aspect ratio, obviously, or one will be stretched.)

The only thing that matters is the base resolution. You'll find a dedicated slider for it (I typically use 512 or lower, but try 768 if your hardware can handle it and you have patience).
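As a rough sketch of what I mean by "only the base resolution matters": think of the working size as the input scaled so its shorter side lands near the chosen base, keeping the aspect ratio. The snapping to multiples of 16 below is my assumption, not a confirmed detail of the nodes.

```python
def target_size(src_w: int, src_h: int, base: int = 512, multiple: int = 16):
    """Scale so the shorter side is ~base, keep the aspect ratio, snap to a multiple."""
    scale = base / min(src_w, src_h)
    w = max(multiple, round(src_w * scale / multiple) * multiple)
    h = max(multiple, round(src_h * scale / multiple) * multiple)
    return w, h

print(target_size(1920, 1080, base=512))  # horizontal input -> (912, 512)
print(target_size(1080, 1920, base=320))  # vertical input -> (320, 576)
```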

### Important Suggestions And Examples:

- To achieve a good, consistent result when using "First/Last" mode, the two images need to be similar (same location, people, and very close positioning of everything). You can use any images, but you'll get the best results with two that are fairly similar: for example, two screenshots from a random internet video, a 3D character posed in two different poses, or two similar AI-generated images 🙄

- Stay around 10-15 steps; go higher for better quality (it's hit or miss below that, although I've gotten some nice ones at 5 steps).

- For quick tests, use a lower base resolution (like 320). At that resolution it takes around 10 seconds on my 3090.

- If the results are full of artifacts, switch to "custom prompt only" to avoid auto-prompting and get more stable, consistent animations by simplifying the prompt. Write something simple.

(Check the manual written in the workflow.)

Simple prompts like "a person posing, blink, camera shake" or words like wiggle, earthquake, lens flares, blink, camera shake, handheld camera have already been tested here with great success. Please share your findings!

- If the video seems too fast for your settings, turn on the interpolation group and raise the "extra interpolation multiplier", or change the video length in the COG settings group according to the user manual written right inside the workflow (see the rough math sketch below).
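Here is the rough math behind that last tip, under my assumption of how the interpolation group behaves: interpolation multiplies the frame count, so at the same playback frame rate the clip lasts longer and the motion looks slower. The frame count and fps below are just example numbers.

```python
def playback_seconds(cog_frames: int, interp_multiplier: int, out_fps: float) -> float:
    total_frames = cog_frames * interp_multiplier   # assumed behaviour of the group
    return total_frames / out_fps

print(playback_seconds(49, 1, 24))  # ~2.0 s at original speed
print(playback_seconds(49, 2, 24))  # ~4.1 s, motion looks roughly half as fast
```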

### Other Considerations:

I've tested Cog A LOT and changed the values from the standard settings to something I think works better, at least based on my tests.

Feel free to make your own changes (and if you find better settings, please let us know)!

No need to buzz me, I'm fine. Ty 💗 Feedback is much more appreciated.

__________________________________________________________________________________________________

*Please note:

Do not confuse the different Cog models; there are many, and they differ from one another.

I suggest taking a look around to understand what the other Cog models are capable of. There is a bit of confusion around, but if you are looking for a way to produce videos quickly, with the most dynamic options, resolutions, ratios, first/last, and trajectories, then I recommend sticking to this workflow or at least using the models I use in this workflow.

SINCE TORA IS NOW COMPATIBLE WITH THESE X-FUN VERSIONS OF COG, I ADDED IT TO THE WORKFLOW.

__________________________________________________________________________________________________

V7.0

changelog:

  • fixed broken nodes caused by constant updates of surrounding packages

  • removed some dead nodes

V6.0

changelog:

  • added 4 Tora trajectories

  • More UI controls

  • Better settings, refinements, and tips included

To use Tora, be sure you are using the 5B model (there's a switch to change from 2B to 5B),

then:

1) Deactivate "let's Cog" in groups.

2) Load an image and run.

3) Set up the 4 trajectories (Ctrl+click to break the splines into multiple points).

4) Activate "let's Cog" and run (the "extend video" group will automatically turn on when you activate "let's Cog"; deactivate it for now until I figure out how to extend this Tora mode).

__________________________________________________________________________________________________

V5.0

changelog:

  • Extra Extend (ability to load a third image to create a video using 3 images)

  • More UI controls

  • Better settings, refinements, and tips included

__________________________________________________________________________________________________

V4.0

changelog:

  • EXTEND now works for all modes (except first/last mode)

  • improved UI

+ LoRA strength slider

+ seed management

+ prompt strength

+ COG frames amount control

+ model selector 2B/5B

+ negative prompts

other changes:

  • efficiency improvements

  • completely revisited chain system

  • some minor fixes

_____________________________________________________________________________________________

V3.0

changelog:

improved UI:

  • added LoRA loader and strength slider

  • model selector 2B/5B

  • negative prompts

  • some workflow efficiency improvements

other changes:

  • switched to a faster interpolation method

  • some minor fixes

  • updated user's manual

_____________________________________________________________________________________________

V2.0

changelog:

  • more refined workflow

  • more options

  • simple UI

  • added User's Manual

____________________________________________________________________________________________

📽️H A V E F U N📽️