
Hunyuan Video: What model to use for 8GB-24GB

Hunyuan Video

Hunyuan Video is by Tencent.

This article links to Comfy native files and not the files for the Kijai nodes.

UNET

Even 4090 users will end up block-swapping to the CPU if the BF16 model is used. The FP8 model is recommended for most users.
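
If you only have the BF16 file on disk but want the weights held in FP8 once loaded, here is a minimal sketch of a launch line, assuming your ComfyUI build includes the --fp8_e4m3fn-unet cast flag; this is not a substitute for downloading the pre-quantized FP8 checkpoint.

rem Assumed flag: cast the UNET weights to FP8 (e4m3fn) at load time
cmd /k python main.py --fp8_e4m3fn-unet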

CLIP-L

Tencent recommends the full ViT/vision model, but ComfyUI does not seem to support this with the current version (as of 4/11/2025).

You can certainly use a full CLIP-ViT-L model, but the vision portion is likely not being utilized (the Kijai nodes might use it).

Personally, I find the choice of CLIP-L doesn't matter too much, so don't be overwhelmed by the large list.

Pruned CLIP-L -- FP32 Versions

  1. Zer0Int Improved Text

  2. Felldude Average Female Face Fix

  3. Base CLIP-L Pruned

Full Vision Models -- FP32

  1. Zer0Int Detail+VIT -- I personally had the least glitches in videos with this model

  2. Base Vit

  3. Zer0Int Gated-Balanced

  4. AbstractPhil Sim4+Vit

TE (llava-llama-3-8b)

Uncensored FP32

Uncensored BF16

For ComfyUI users this is quite important: DO NOT USE THE HIGHVRAM FLAG. If it is set, you will force the CLIP/TE to be kept on the GPU.

For users with plenty of system RAM I would use the FP32 model. All CLIP/TE models are loaded in FP16 by default unless the FP32 flag is set, and the load message will look like this:

CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16

Other than a much longer initial load time, using FP32 vs FP16 will not cause a speed difference on the CPU; in many cases FP16 is actually slightly slower than FP32.

For FP32 use:

--fp32-text-enc

The output will be:

CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float32

Using a scaled FP8 TE is not recommended unless you have 24GB of VRAM and are using the highvram flag. Normal users: DO NOT USE THIS FLAG.

The reason is that 4090 users have a GPU that can handle FP8 faster than other cards. With 24GB of VRAM they can load all the models in FP8 into VRAM without needing block swapping or offloading:

--highvram  
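
As a sketch only, a 24GB launch along those lines could look like the following, assuming the FP8 UNET and scaled FP8 TE files are the ones selected in your loader nodes (the flag controls where models are kept, not which files you load).

rem Keep models resident in VRAM; sage attention as in the example .bat below
cmd /k python main.py --highvram --use-sage-attention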

VAE

In most cases the BF16 VAE is the most appropriate. ComfyUI will downcast the FP32 VAE unless the FP32 VAE flag is set. I do not recommend the FP8 VAE.

To use the FP32 VAE without it being downcast, the following argument must be passed:

--fp32-vae

Example .bat file

The Sage Attention flag can be set for 8-bit attention.

@echo off
call .\venv\Scripts\deactivate.bat
call .\venv\Scripts\activate.bat

cmd /k python main.py --fp32-text-enc --fp32-vae --bf16-unet --use-sage-attention  
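
If system RAM is tight, a hedged variant that drops the two FP32 flags and falls back to the FP16 CLIP/TE and downcast VAE defaults described above:

cmd /k python main.py --bf16-unet --use-sage-attention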