Blockwise NF4 (Full Checkpoint - 22GB)
In Forge, use Automatic FP16 LoRA, not NF4 or NF4 Automatic.
Recommended for Forge: set COMMANDLINE_ARGS= --unet-in-bf16 --vae-in-fp32
This is a full checkpoint: DO NOT LOAD an additional TE, VAE, or CLIP.
NO changes have been made to the Blackforest base diffusion model, other than mixed precision quantization.
This model is likely the first of its kind to combine NF4 quantization with Blackforest's recommendation not to quantize the TE blocks.
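For readers unfamiliar with the terms, here is a minimal sketch of what blockwise NF4 with unquantized TE blocks means. It is not the pipeline used to build this checkpoint. The 16 codebook values are the NF4 levels as published with QLoRA / bitsandbytes; the block size of 64, the function names, and the "text_encoder." key prefix are assumptions made for illustration only, and real implementations also pack two 4-bit codes per byte.

```python
# NF4 codebook: 16 levels in [-1, 1] (values as published with QLoRA / bitsandbytes).
NF4_LEVELS = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def quantize_blockwise(values, block_size=64):
    """Return (codes, scales): one 4-bit index per value, one absmax scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # Each block is normalised by its own largest magnitude (1.0 if all zero).
        scale = max(abs(v) for v in block) or 1.0
        scales.append(scale)
        for v in block:
            x = v / scale
            # Store the index of the nearest codebook level.
            codes.append(min(range(16), key=lambda i: abs(NF4_LEVELS[i] - x)))
    return codes, scales


def dequantize_blockwise(codes, scales, block_size=64):
    """Rebuild approximate values from codes and per-block scales."""
    return [NF4_LEVELS[c] * scales[i // block_size] for i, c in enumerate(codes)]


def quantize_checkpoint(tensors, skip_prefixes=("text_encoder.",), block_size=64):
    """Mixed precision: quantize every tensor except those whose name starts
    with a skipped prefix. The prefix is hypothetical, not this checkpoint's layout."""
    out = {}
    for name, values in tensors.items():
        if name.startswith(skip_prefixes):
            out[name] = values  # left at original precision
        else:
            out[name] = quantize_blockwise(values, block_size)
    return out
```

Because each block carries its own scale, a single large weight only coarsens the values in its own block rather than the whole tensor, which is the point of doing the quantization blockwise.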
High accuracy and speed while still fitting under 24GB (also works well on 16GB and 8GB cards).