| Type | Workflows |
| --- | --- |
| Stats | 410 |
| Reviews | 15 |
| Published | Aug 12, 2024 |
| Base Model | |
| Hash | AutoV2 C5C8CB5D7A |
NF4 is significantly faster and more memory-efficient than FP8 because it uses bitsandbytes' native `bnb.matmul_4bit`, which avoids dtype casting and takes advantage of low-bit CUDA kernels. It also achieves better numerical precision and dynamic range by storing weights across multiple tensors of varying precisions (the packed 4-bit blocks plus higher-precision per-block scaling statistics), unlike FP8's single-tensor approach.
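As a rough illustration of that call path, here is a minimal sketch using bitsandbytes directly (shapes and variable names are my own, a CUDA device is assumed, and the NF4 loader node wires all of this up for you inside ComfyUI):

```python
import torch
import bitsandbytes as bnb
import bitsandbytes.functional as F

# Hypothetical weight/activation shapes, just to show the NF4 call path.
weight = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
x = torch.randn(1, 4096, dtype=torch.float16, device="cuda")

# quantize_4bit packs the weight into 4-bit NF4 blocks and returns a
# quant_state holding the higher-precision per-block scaling factors.
weight_nf4, quant_state = F.quantize_4bit(weight, quant_type="nf4")

# matmul_4bit consumes the packed weight directly -- no full-precision
# copy of the weight is materialized, which is where the speed and
# memory savings over a cast-then-matmul path come from.
y = bnb.matmul_4bit(x, weight_nf4.t(), quant_state=quant_state)
print(y.shape)  # torch.Size([1, 4096])
```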
The list of all NF4 models (LoRA is not supported yet):
https://huggingface.co/silveroxides/flux1-nf4-weights/tree/main
To install, go to `ComfyUI/custom_nodes/` and clone the loader you need.

For the full checkpoint loader:

git clone https://github.com/comfyanonymous/ComfyUI_bitsandbytes_NF4

Or, for the UNet-only loader:

git clone https://github.com/DenkingOfficial/ComfyUI_UNet_bitsandbytes_NF4.git
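Both loaders depend on bitsandbytes; a quick sanity check (a hypothetical snippet, not part of either repo) you can run before restarting ComfyUI:

```python
# Hypothetical sanity check: the NF4 loader nodes need a working
# bitsandbytes install and a CUDA device to be usable.
import torch
import bitsandbytes as bnb

assert torch.cuda.is_available(), "NF4 inference requires a CUDA GPU"
print("bitsandbytes", bnb.__version__, "- OK")
```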
Then run this workflow in ComfyUI. In the "more advanced" workflow, just disable the regular checkpoint loader and activate the NF4 checkpoint loader instead.
For Flux resolution, the workflow uses a resolution helper custom node; install it as well, or remove it from the graph if you don't want it.
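If you remove that node, here is a rough sketch of the kind of math such a resolution helper does (my own assumption of its behavior, not the node's actual code): Flux latents are downsampled 8x by the VAE and patched 2x2, so width and height are usually kept as multiples of 16, scaled to a target pixel budget.

```python
import math

def flux_resolution(aspect_ratio: float, megapixels: float = 1.0, step: int = 16):
    """Pick a width/height near the target pixel count for a given aspect
    ratio, rounded to multiples of `step` (the 8x VAE downsampling times
    the 2x2 patch size). The exact rounding rules of the node used in the
    workflow may differ."""
    target = megapixels * 1024 * 1024
    height = math.sqrt(target / aspect_ratio)
    width = height * aspect_ratio
    return (round(width / step) * step, round(height / step) * step)

print(flux_resolution(16 / 9))  # (1360, 768)
print(flux_resolution(1.0))     # (1024, 1024)
```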