
FLUX Blockwise - Have we been building models wrong?


Nov 28, 2024


announcement

Flux Blockwise

I had to build several custom tools to create this mixed-precision model; to my knowledge, it is the first one built this way.

  • Faster and more accurate than any other FP8 quantized model currently available

  • Works in Comfy and Forge, but Forge needs its UNET set to BF16

  • In Comfy, load it as a diffusion model and USE THE DEFAULT WEIGHT dtype

  • Despite being larger, the model runs faster when used properly

  • FP16 upcasting should not be used unless absolutely necessary, such as when running on CPU or IPEX

  • FORGE - set COMMANDLINE_ARGS= --unet-in-bf16 --vae-in-fp32

  • Other than the need to force Forge into BF16 (FP32 VAE optional), it should work the same as the DEV model, with the added benefit of being 5 GB smaller than the full BF16 version

Every FLUX model that is not BF16 should be mixed precision.

This is similar to the scaled FP8 (E4M3 and E5M2) T5-XXL encoders, only even more critical here.
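For context on "scaled" FP8: the two formats trade mantissa precision for exponent range, and scaled quantization divides each tensor by a per-tensor scale so its values fit the format's largest representable normal value. A minimal sketch (the constants are the standard FP8 maximums; the helper name is mine, not from any tool mentioned above):

```python
# Standard FP8 formats and their largest normal values.
# E4M3 (4 exponent bits, 3 mantissa bits): finer precision, narrower range.
# E5M2 (5 exponent bits, 2 mantissa bits): coarser precision, wider range.
FP8_MAX_NORMAL = {
    "E4M3": 448.0,
    "E5M2": 57344.0,
}

def fp8_scale(amax: float, fmt: str = "E4M3") -> float:
    """Per-tensor scale so the largest |weight| maps onto the format's
    largest representable normal value. Weights are stored as
    weight / scale in FP8 and multiplied back by scale at load time."""
    return amax / FP8_MAX_NORMAL[fmt]

# Example: a tensor whose largest magnitude is 8.96 gets scale 0.02 in E4M3.
scale = fp8_scale(8.96, "E4M3")
```

A tensor whose values all fit within the format's range gets a scale near 1.0, which is why scaling matters most for outlier-heavy layers.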

It turns out that, to my knowledge, every quantized model up to this point, including my own, has been built sub-optimally per Black Forest Labs' recommendations.

Only the UNET blocks should be quantized in the diffusion model, and they should be upcast to BF16, not FP16 (Comfy does this correctly).

I am currently trying to work out how to follow Black Forest Labs' recommendations while using GGUF.

The blocks that should stay BF16 and never be quantized:

All text blocks (txt), plus:

'time_in.in_layer.bias',
'time_in.in_layer.weight',
'time_in.out_layer.bias',
'time_in.out_layer.weight',
'txt_in.bias',
'txt_in.weight',
'vector_in.in_layer.bias',
'vector_in.in_layer.weight',
'vector_in.out_layer.bias',
'vector_in.out_layer.weight'
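The rule above can be sketched as a simple keep-list predicate over checkpoint key names (a hedged illustration, not the author's actual tooling; the `txt_` substring match for the transformer-block text tensors is my assumption based on common FLUX checkpoint naming such as `double_blocks.0.txt_attn.qkv.weight`):

```python
# Embedding layers the post says must stay BF16, listed verbatim.
KEEP_BF16 = {
    'time_in.in_layer.bias', 'time_in.in_layer.weight',
    'time_in.out_layer.bias', 'time_in.out_layer.weight',
    'txt_in.bias', 'txt_in.weight',
    'vector_in.in_layer.bias', 'vector_in.in_layer.weight',
    'vector_in.out_layer.bias', 'vector_in.out_layer.weight',
}

def should_stay_bf16(key: str) -> bool:
    """True if this tensor should remain BF16; everything else
    (the image-side UNET blocks) is a candidate for quantization."""
    # Explicit embedding layers from the list above, plus any
    # text-conditioning tensor inside the transformer blocks.
    return key in KEEP_BF16 or '.txt_' in key or key.startswith('txt_')
```

A quantizer would then walk the state dict, cast keys where `should_stay_bf16` is False to FP8, and leave the rest in BF16.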


[Image: remix of the lion image]
