Published | Dec 18, 2024 |
Training | Steps: 1,111,111 Epochs: 11,111 |
Hash | AutoV2 4BD21395B0 |
Hunyuan Video
Converted to safetensors.
Llava Llama was serialized and 4 blocks were converted to FP8. (Kijai has not added local text encoder support nodes yet; hit them up, as some of us don't trust forced downloads.)
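FP8 conversion trades precision for file size. As a rough illustration only (pure Python, not the actual conversion code used here), this rounds a value onto the E4M3 grid that FP8 weights use (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, max normal 448):

```python
import math

def fp8_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3-representable value.

    E4M3: 4 exponent bits (bias 7), 3 mantissa bits, subnormals below
    2^-6, largest finite value 448. Sketch for illustration only.
    """
    if x == 0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    exp = max(math.floor(math.log2(mag)), -6)  # clamp into subnormal range
    step = 2.0 ** (exp - 3)                    # 3 mantissa bits: 8 steps per binade
    q = round(mag / step) * step               # round to nearest grid point
    return sign * min(q, 448.0)                # clamp to E4M3 max finite value
```

For example, 0.3 lands on 0.3125, which is the kind of rounding error every FP8-converted block absorbs.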
The VAE was converted to safetensors.
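For the curious, the safetensors container is simple: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype/shape/offsets, then the raw tensor bytes. A minimal stdlib-only writer sketch (not the converter used for these files):

```python
import json
import struct

def save_safetensors(path, tensors):
    """Write a minimal safetensors file.

    tensors: dict of name -> (dtype_str, shape, raw_bytes),
    e.g. {"vae.weight": ("F32", (2, 2), b"...16 bytes...")}.
    Sketch only: no alignment padding or metadata block.
    """
    header, blobs, offset = {}, [], 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": list(shape),
            "data_offsets": [offset, offset + len(data)],
        }
        offset += len(data)
        blobs.append(data)
    hjson = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(hjson)))  # 8-byte header length
        f.write(hjson)                          # JSON header
        for blob in blobs:                      # raw tensor data
            f.write(blob)
```

Real converters go through the `safetensors` library, but the on-disk layout is exactly this shape.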
For CLIP-L they recommend using the full vision model.
NOTE: When using the FP8 model, use FP8 Scaled; for INT4, BF16 should likely be used.
Please note the TE is untested, but the VAE and UNET are confirmed working (on my CPU).
It is possible that the vision elements of the TE could be pruned, as well as the vision elements of the CLIP.
The image was rendered on my CPU, taking 90 seconds per iteration at scaled FP8.
The BF16 model is faster at INT4, at 40 seconds per iteration, but still not viable for rendering video on CPU alone. Let's hope for NF4 or Q4 models for us 8GB users.