
Hunyuan YAW (Yet Another Workflow) Easy T2V, I2V (SkyReels), V2V, audio, random-lora, preview pause, upscale, multi-res, interpolate, prompt save/load, teacache, new interface, Fast

Tencent Hunyuan is licensed under the Tencent Hunyuan Community License Agreement, Copyright © 2024 Tencent. All Rights Reserved. The trademark rights of “Tencent Hunyuan” are owned by Tencent or its affiliate.
Powered by Tencent Hunyuan

** Sorry, interpolation was set to 24 fps instead of 48, which caused the interpolated render to play back at half speed. I fixed it and updated the download. (Please download again.)

NEW in V6.2! Major overhaul. Significant interface changes. Dual randomized Lora stacks with triggers/prompts and wildcards - amp up your overnight generation runs! Prompt Save/Load and more. Face Restore. Audio generation has been improved, stand-alone audio generation, T2V, I2V via SkyReels, GGUF support, use system RAM as VRAM.

Read full instructions below for more info.

Workflow highlights:

  • Audio Generation via MMaudio - render audio with your videos; a stand-alone plug-in is available for audio-only post-processing.

  • FAST preview generation with optional pausing

    • Preview your videos in seconds before proceeding with the full length render

  • Lora Randomizer - 2 stacks of 12 Loras that can be randomized and mixed and matched. Includes wildcards, triggers, or prompts. Imagine random characters + random motion/styles, then add in wildcards and you have the perfect overnight generation system.

  • Prompt Save/Load/History

  • Multiple Resolutions

    • Quickly Select from 5 common resolutions using a selector. Use up to 5 of your own custom resolutions.

  • Multiple Upscale Methods -

    • Standard Upscale

    • Interpolation (double frame-rate)

    • V2V method

  • Multiple Lora Options

    • Traditional Lora using standard weights

    • Double-Block (works better for multiple combined loras without worrying about weights)

  • Prompting with Wildcard Capabilities

  • Teacache accelerated (1.6 - 2.1X the speed)

  • All options are toggles and switches; no need to manually connect any nodes

  • Detailed Notes on how to set things up.

  • Face Restore

  • Text 2 Video, Video 2 Video, Image 2 Video

  • Fully tested on 3090 with 24GB VRAM

This workflow focuses on being easy to use for beginners while staying flexible for advanced users.

This is my first workflow. I personally wanted options for video creation, so here is my humble attempt.

Additional Details:

I am new to AI and Comfy, and this is my first workflow. I loved the "Hunyuan 2step t2v and upscale" workflow (https://civitai.com/models/1092466/hunyuan-2step-t2v-and-upscale) and used it heavily as the base, so this should work on the same setups as the original.

** Troubleshooting for missing nodes and ComfyUI Manager can be found at the bottom of this document.

Quick Start Guide:

By default, everything has been tuned for a functional workflow, following the Hunyuan 2step t2v and upscale workflow.

This is the workflow:

Step 0. Set up your models in the Load Models section and select your resolution. I recommend either Resolution 1 for 16x9 or Resolution 3 for 3x4; these are the best starting resolutions depending on whether you want landscape or portrait. 16x9 is generally more versatile; 3x4 may be higher quality but limited in video length, all depending on your GPU VRAM. Start with the default steps and video length.

Step 1. Render a low-res preview and check that your loras/motion prompt are working.

Step 2. Pause and decide based on the preview if you want to continue to the full render

Step 3. Uses the low-quality render as an input to guide a better mid-quality render. This doubles the selected resolution.

Step 4. Uses a frame-by-frame upscaler that doubles the resolution again.

Step 5. Doubles the frame rate from 24 fps to 48 fps for smoother motion.

(Optional Step) Enable MMaudio Generation - it will create audio to go along with your video, using both your text prompt and the video itself to determine what sounds to add. Describe the sounds in the scene in your text prompt for better generation. This uses more VRAM, so it has been disabled by default. You can always add audio generation at the end using the Standalone MMaudio Plugin.

From here you can start to adjust things like # of steps, video length, and resolutions to find the best balance of what your available VRAM can handle.
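To make the scaling concrete, here is a minimal Python sketch of the arithmetic described above, assuming the default factors from this guide (Stage 3 and Stage 4 each double the resolution, interpolation doubles 24 fps to 48 fps). The function name and structure are purely illustrative and are not read from the workflow itself.

```python
# Illustrative only: mirrors the default factors described in this guide.
def pipeline_summary(width, height, frames, fps=24):
    preview = (width, height)              # Step 1: fast low-res preview render
    mid = (width * 2, height * 2)          # Step 3: guided re-render at 2x
    final = (mid[0] * 2, mid[1] * 2)       # Step 4: frame-by-frame upscale, 2x again
    return {
        "preview": preview,
        "mid": mid,
        "final": final,
        "fps_after_interpolation": fps * 2,        # Step 5: 24 fps -> 48 fps
        "duration_seconds": round(frames / fps, 2),  # duration is unchanged by interpolation
    }

# Resolution 1 (368x208, see the tips section below) with a 73-frame clip:
print(pipeline_summary(368, 208, frames=73))
# {'preview': (368, 208), 'mid': (736, 416), 'final': (1472, 832),
#  'fps_after_interpolation': 48, 'duration_seconds': 3.04}
```

This is just a way to sanity-check how large your final video will be (and how long it runs) before you commit VRAM to a full render.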

All toggles and switches:

Make sure you only choose 1 Method in Step 1.

* These are the default settings.

You should never need to rewire anything in this workflow. Detailed instructions and comments right inside the workflow.

V2V - Video to Video:

Enable it on the Control Panel:

You can use video as an input or guide for your video. Enable this option in Control Panel and click to upload your source input video. Please note that this will use your selected resolution for output.

To adjust how closely the output follows your input video, adjust the Denoise in the main Control Panel. Lower values (0.5-0.75) keep closer to your input video, while higher values get more creative.

I2V - Image to Video (Experimental)

This method uses SkyReels for I2V. I wasn't originally going to support this, but a few users requested the feature. I will not be providing support or fixes for it; this is simply a placeholder until the official Hunyuan model is released. I've taken some liberties to try to get the most out of SkyReels.

Enable it in the Main Control Panel, then set up the models here:

You must set up your SkyReels models; get them here: https://huggingface.co/Kijai/SkyReels-V1-Hunyuan_comfy/tree/main

I also recommend downloading the Skyreels-i2v-smooth-Lora: https://huggingface.co/spacepxl/skyreels-i2v-smooth-lora/tree/main

If you don't want to use the smooth lora, simply right-click on the box and choose "Bypass"; it will turn pink when deactivated.

Use Load Image to load your source image. The image will be appropriately scaled so as not to break this plug-in. The output resolution uses your selected resolution from the Resolution Selector.

I2V Method 1: One-Pass to upscale/interpolation/audio

The primary way to use this method is to disable 1a, 1b, and 3 in the main workflow. It will take your image as input, render the video at 2X your selected resolution, then send it to steps 4/5 for upscale and interpolation. For example, if you selected Resolution 3 (3x4) at 224x320, Resolution Scale (2X) renders the video at 448x640; after this it goes to the upscaler and doubles again to 896x1280. This yields pretty good results. You can adjust the image scaling with this setting; the default is 2X. Don't go too crazy or you will run out of VRAM.

I2V Method 2: Two-Pass to upscale/interpolation/audio

Disable 1a and 1b in the main workflow. The video will render at your selected resolution using the SkyReels model, then be sent to Stage 2 to be re-rendered using the Hunyuan model. This uses more VRAM, and since Stage 2 normally doubles resolution with an upscale latent factor of 2, you need to be mindful of your resolution scaling. I recommend setting Resolution Scale to 1.0 or 1.5 to start if you want to use this method. You will definitely want to reduce the denoise to something much lower (0.5-0.7) to maintain similarity to the original image; otherwise it will get too creative and not keep the core elements of your image. In theory this is the same as using Method 1 and then automatically doing a V2V on the resulting video; this just does it in one flow at the expense of VRAM, since two models load into memory.

** Please note I2V is experimental and not perfect; this will be replaced when the official Hunyuan I2V is released.

Selecting your Model (LOW VRAM options):

Although I have been testing on a 24GB VRAM setup, I've had a lot of users request help with lower VRAM. I have not tested these options, so I'm hoping some of these features will help them out.

Load your standard BF16 or FP8 models into "1. Load Standard Diffusion Model"

Load your GGUF models into "2. Load Model GGUF (MultiGPU/System Ram as VRAM)"

From what I understand, the GGUF models can take a bit longer but really save on VRAM, depending on the model chosen.

Use the Green Selector box to choose the model you wish to use in your workflow.

For VRAM savings, set "device" to CPU in the DualCLIP Loader - if you don't see the option, right-click on it, choose "Show Advanced", and it should show up.

If using the GGUF model, you can set "use_other_vram" to "true"; this allows you to use system RAM as VRAM and hopefully prevents some of the OOM errors. You can set the amount of virtual VRAM to use above. Please note that any time you are using system RAM, the render time will be much slower, but at least your production won't stop.

** I also noticed that there is a 24GB-sized GGUF model - does anyone know if this is as good as the BF16 model? I don't want to sacrifice quality, but would love to use the virtual VRAM feature. If anyone knows, please let me know in the comments.

Lora Options:

Traditional loras and Double-Block can both be used; Double-Block is the default.

Double-Block seems to do better with multiple loras without having to worry about adjusting weights as often.

The Main Lora Stack is a standard additive lora tree. Add or combine up to 5 different loras and set your all, single_blocks, and double_blocks values according to the loras you are using. You can run these loras in addition to the random loras. Add a style in the main lora section, then add randomized character loras with randomized character animations.

Enable/disable your loras by right-clicking and choosing "Bypass".

Resolution Options:

Select from 5 common resolutions, or edit an additional 5 custom resolutions to make them your own. Change resolution with the "Resolution Selector". By default the fastest, smallest resolution is selected, intended to feed the next V2V portion of the workflow. As you move to the larger resolutions, your render times will become much longer.

Pause after Preview - (On by default)

Video generation takes a long time; experimenting with multiple loras or getting your prompt right is painful when render times are slow. Use this to quickly preview your videos before spending the additional time on the upscale process. By default this is enabled. After you start the workflow, it will quickly render the fast preview, then you will hear a chime. Make sure you scroll to the middle section beside the video preview for the next steps.

Upscale the previews you like, or cancel and try again!

1) Continue the full render/workflow - select ANY image (it doesn't matter which one) and click "Progress Selected Image"

2) Cancel - click "Cancel current run" then queue for another preview.

To disable this feature, toggle it off under "Options Selector"

MMaudio - Add audio automatically to your video

By default it will only add audio to upscaled videos; however, there is a switch to enable it for all parts of the render process. Be sure to add any audio details to your prompt for better generation.

** Note MMaudio takes additional VRAM; you may need to balance video length and quality when using it. A stand-alone plugin is available (since v5.2) so you can add your audio after you have finalized your video in the main workflow. This lets you maximize the quality and video length according to your VRAM, then simply add audio as an additional step in post-processing. Using the standalone also offers the flexibility to generate multiple times to get the perfect audio for your video.

Interpolation after Upscaling

This option lets you double the frame rate of your rendered video; it is enabled by default.

You can disable it from the "Options Selector" if you don't need it, since it adds to your render time.

I feel the need, the need for Speed

Things running too slowly? You can increase the Teacache speed up to 2.1X with minimal sacrifice in quality. The default is Fast (1.6X). Please note there are 2 Teacache sampler nodes.

T2V - Text to Video - Prompting and Wildcards

Please enter your prompt in the green "Enter Prompt" node. *** Please ensure your prompt doesn't have any linefeeds or new lines, or it will change the way the system processes the workflow.

Wildcards allow you to automatically vary your prompt or do overnight generation with variations. To create wildcards, create a .txt file in the folder /custom_nodes/ComfyUI-Easy-Use/wildcards. Put one wildcard on each line, pressing Enter to separate them. You can use single words or phrases, as long as each is on its own line; do not double-space. Here are 2 example wildcard files.

color.txt

red

blue

green


locations.txt

a beautiful green forest, the sunlight shines through the trees, diffusing the lighting and creating minor godrays, you can hear the sound of trees rustling in the background

a nighttime cityscape, it is raining out, you can hear the sound of rain pitter-pattering off of the nearby roofs

a clearing in the forest, there is a small but beautiful waterfall at the edge of a rocky cliff, there is a small pond and green trees, the sound of the waterfall can be heard in the distance, birds are chirping in the background


To use these in a prompt, click "select to add wildcard" and add them at the appropriate spot in the prompt.

ellapurn3ll is wearing a __color__ jacket, she is in __locations__.

Full details on this custom node can be found here: https://github.com/ltdrdata/ComfyUI-extension-tutorials/blob/Main/ComfyUI-Impact-Pack/tutorial/ImpactWildcard.md
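If you are curious how the __name__ placeholders behave, the sketch below mimics the substitution: each __color__ or __locations__ token is replaced with a random line from the matching .txt file in the wildcards folder. This is only an approximation of what the wildcard node does, not its actual code; the path and function name are just for illustration, and the example print assumes color.txt and locations.txt exist.

```python
import random
import re
from pathlib import Path

# Hypothetical path for illustration; the guide says wildcard files live in
# custom_nodes/ComfyUI-Easy-Use/wildcards.
WILDCARD_DIR = Path("custom_nodes/ComfyUI-Easy-Use/wildcards")

def expand_wildcards(prompt, seed=None):
    """Replace each __name__ token with a random line from name.txt."""
    rng = random.Random(seed)  # a fixed seed reproduces the same expansion

    def pick(match):
        wildcard_file = WILDCARD_DIR / f"{match.group(1)}.txt"
        lines = [line.strip() for line in wildcard_file.read_text(encoding="utf-8").splitlines() if line.strip()]
        return rng.choice(lines)

    return re.sub(r"__(\w+)__", pick, prompt)

# Assumes color.txt and locations.txt from the examples above exist on disk.
print(expand_wildcards("ellapurn3ll is wearing a __color__ jacket, she is in __locations__.", seed=42))
```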

Randomized Lora and Triggers

Turn up your overnight generations by using both wildcards and Randomized Loras.

Choose up to 12 random loras to mix and match. Please note that by default only the first 5 are enabled. Change the Maximum in the appropriate setting to match the number of loras you have configured; it always counts from top to bottom. So if you only want to randomize between 3 loras, set the "maximum" to 3 and fill in the information for the top 3 loras.

** Very important. In order for the trigger words to populate, you must include the text:

(LORA-TRIGGER) or (LORA-TRIGGER2) in your prompt field. It will then automatically fill in the value when generating with random loras. This is case sensitive, so please be careful.

Please note you can put full prompts, single triggers, or trigger phrases, and it will be filled in automatically for you.

To add wildcards to this, use {} braces and the | separator. For example: She is wearing a {red|green|blue} hat. Or you can do full prompts: {she is standing in Times Square blowing a kiss|she is sitting in a park blowing a kiss}. See the sketch below for how these pieces fit together.
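The following is a rough, hedged approximation of the behaviour described above, not the nodes' actual code: one lora is picked at random from the first "maximum" entries of the stack, its trigger text replaces the (LORA-TRIGGER) placeholder, and each {a|b|c} group resolves to one random option. The stack entries and trigger words below are hypothetical.

```python
import random
import re

def build_prompt(prompt, lora_stack, maximum, seed=None):
    """Illustrative sketch of randomized-lora triggers plus {a|b|c} wildcards."""
    rng = random.Random(seed)

    # Only the first `maximum` entries are eligible, counted top to bottom.
    chosen = rng.choice(lora_stack[:maximum])

    # The placeholder is case sensitive, exactly as noted in the workflow.
    prompt = prompt.replace("(LORA-TRIGGER)", chosen["trigger"])

    # Resolve each {option1|option2|...} group to one random option.
    prompt = re.sub(r"\{([^{}]+)\}", lambda m: rng.choice(m.group(1).split("|")), prompt)
    return prompt

# Hypothetical stack entries for illustration only.
stack = [
    {"name": "character_a.safetensors", "trigger": "ellapurn3ll"},
    {"name": "character_b.safetensors", "trigger": "m4rcusdoe"},
    {"name": "character_c.safetensors", "trigger": "z0eblake"},
]
print(build_prompt("(LORA-TRIGGER) is wearing a {red|green|blue} hat.", stack, maximum=3, seed=7))
```

Because a single seed drives the randomness (see "One Seed to Rule them All" below), re-using the same seed reproduces the same lora pick and wildcard choices.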


Load and Save your favorite prompts with "Prompt Saver"

As you run your workflow, it will automatically populate your Prompt Saver with the latest prompt. You can then save it for later use. To load and use a prompt, select your previously saved prompt and click "Load Saved". Importantly, you must toggle "Use Input" to "Use Prompt" in order to use your loaded prompt. Don't forget to switch it back to "Use Input" for normal prompt use.

**Default is "Use Input" This means your prompt will be generated by the normal input wildcard prompt field, and simply show the prompt data in Prompt Saver.

One Seed to Rule them All:

One single seed handles all lora randomization, wildcards, and generation. Simply copy and reuse your favorite seeds with the same sets of random loras and wildcards without a worry.

* Tip: click the recycle button to re-use your last seed. Want to fine-tune or tweak a video you just created? Got an OOM in the 2nd stage? Use the last seed, adjust, and try again!

Standalone MM-Audio:

To maximize your quality and video length, you may want to disable MM-audio in the main workflow and add the audio later in post-processing. This plug-in is meant to run stand-alone to add audio afterwards.

Enable MMAudio - Standalone and disable all other parts of the workflow.

Simply upload the video you want to add audio to; all calculations are done for you. It's recommended to use a blank/empty prompt, but I've included the prompt saver if you want to load your previously saved prompt.

(Optionally) you can enhance your prompt, focusing on describing the sounds or the scene as it relates to sound.

Generate as many times as you want until you get the sound just right!

Standalone Upscaler and Interpolation:

Want to just upscale or interpolate your existing video files? Just upload them and disable all other parts of the workflow except the upscaler and interpolation.

The upload box needs to be enabled in the appropriate place.

Click on Enable to "Yes" to use this feature. Don't forget to turn this off when using the regular workflows. By default these should both be disabled.

Tips for how to use the workflow

Fast, V2V method, Lora (default) - this follows the original workflow.

  • Quick Preview using a fast low res video: Use

    • (Resolution 1 - 368x208) for landscape video

    • (Resolution 3 - 320x416) for portrait video

  • Pause to decide if you want to do the full render

  • Full render includes audio, upscaling and interpolation

T2V straight to upscale method

  • Render at medium or high res, then use the upscaler, audio

  • Use Resolution 2, 4, 5 options for higher res - slower render. I personally use resolution 4 for most generations

  • Will Upscale and add audio

Setup

Disable "Intermediate V2V" from Options menu

Select input 2 on Upscaler Select

Increase "steps" to 25 or higher in BetaSamplingScheduler in the Main Window.

Increase the Quality of your Generations

Increase the number of steps:

For the default V2V method, in the Control Panel (Settings), increase your steps from 24 to 35 or higher (up to 50). Each step takes more time and memory, so find the balance between resolution and steps.

For the Main Render Video preview, or when bypassing the intermediate stage, increase your steps from 12 to something much higher, e.g. 30-35.

For the best quality, run 35 steps or higher and reduce both Teacache nodes (main/intermediate) to "Original 1X" instead of "Fast 1.6".

Try higher resolutions:

Change your resolution to one of the larger, higher-res options, e.g. Resolution 2 or 4, then Queue. For the most part, I get the same video, only in a higher res. Please note that by default the large resolutions are double the res of their small counterparts; this helps maintain consistency when using this method. Always use the same aspect ratios, i.e. 1 --> 2, 3 --> 4.

Balance your video length and quality for that perfect video

Here are a few settings I've used which strike a balance between video length and quality. Tested on a 3090 with 24GB VRAM.

Longest video length (16x9):

Use Resolution 1, set video length to 201 frames, set I2V intermediate steps to "23" or "24" in the basic scheduler, and set the scheduler to "beta". Disable MM-audio; use upscalers and interpolation.

For longer videos, setting the Main Video Render to 15 steps is recommended for better guidance.

** Protip: 201 frames is the max Hunyuan video length, and it will often create a perfect loop at this length.
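As a quick sanity check on clip length (and on why the earlier 24-vs-48 interpolation setting made videos play at half speed), the arithmetic is just frames divided by fps. This is an illustrative note, not part of the workflow; the 4n+1 comment is simply an observation about the frame counts used in this guide.

```python
frames = 201        # max length used in this guide
render_fps = 24     # Hunyuan renders at 24 fps
interp_fps = 48     # interpolation doubles the frame rate

print(f"{frames} frames at {render_fps} fps ~ {frames / render_fps:.1f} s")   # ~8.4 s

# Interpolation also doubles the frame count, so the duration stays the same;
# exporting those interpolated frames at 24 fps instead of 48 fps is what made
# earlier versions play back at half speed.
print(f"{frames * 2} frames at {interp_fps} fps ~ {frames * 2 / interp_fps:.1f} s")

# Observation: the lengths quoted in this guide (73, 97, 201) all follow a 4n+1 pattern.
```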

High quality (16x9) (3x4)

Use Resolution 1 or 3, set video length to 97 frames, set Main Video Render BetaSamplingScheduler steps to "15", set I2V intermediate steps to "35" in the basic scheduler, and set the scheduler to "beta". Disable MM-audio; use upscalers and interpolation.

Balanced with Audio (16x9, 3x4)

Use Resolution 1 or 3, set video length to 73 or 97 frames, set I2V intermediate steps to "28" in the basic scheduler, and set the scheduler to "beta". Enable MM-audio; use upscalers and interpolation.


Troubleshooting:

Nodes missing:

MMaudio - If your audio nodes aren't loading, go to ComfyUI Manager and do an "Install via Git URL" with this address: https://github.com/kijai/ComfyUI-MMAudio

Then restart.

If you get a security error, you will need to go to: ComfyUI/user/default/ComfyUI-Manager and look config.ini open that with notepad and look for "security_level = normal" change this to say "security_level = weak". Then try the install. Once you have installed, you can set the setting back to normal. any additional MMaudio information can be found on their github page.

UnetLoaderGGUFDisTorchMultiGPU missing: search in ComfyUI Manager for "ComfyUI-MultiGPU".

You must also have "ComfyUI-GGUF" installed in your ComfyUI. Please make sure both are installed via ComfyUI Manager.

If this doesn't work, you can try ComfyUI Manager's "Install via Git URL" with: https://github.com/pollockjj/ComfyUI-MultiGPU

If you get a security error, you will need to go to: ComfyUI/user/default/ComfyUI-Manager and look config.ini open that with notepad and look for "security_level = normal" change this to say "security_level = weak". Then try the install from comfyui manager.

As a last resort, if you want to completely disable MultiGPU (not recommended): go to the "Load Model" section, make sure your green selector switch "Diffusion Model" is set to 1, then simply delete the node called "2. Load Model GGUF (MultiGPU/System Ram as VRAM)". Everything will run fine; you will just lose the option to use GGUF and its VRAM optimizations.


ReActor or Face Enhanced Nodes Missing:

If you are having trouble with the ReActor node, you can easily remove it. In theory the workflow should work without it, since it's bypassed by default.

1) Go to the RED Restore Faces box and double-click anywhere in the grey space. Search for "Reroute" and add the node.

2) Drag the input line from the left side of Restore Faces to the left side of the new node.

3) Drag a new noodle from the right side of the Reroute node to the Image input of "Upscale Video". Then you can delete the Restore Faces node completely.


That's all, folks. All credit to the original authors of all of this.

Hope you enjoy. Great to be part of such an open and sharing community!

Feel free to share your creations and settings with this workflow.