FLUX schnell / dev 8-bit: FP8, Q8, GGUF and NF4

From the prediction data, we can see that Apple's M5, M6, and even M7 series computers will not be productive tools for Flux image generation over the next 3–4 years, especially when using merged Flux dev models from Civitai. This speed assessment also does not account for OpenPose and other extensions. Furthermore, because of their compact design and poor ventilation, the heat from image generation will accelerate the degradation of Apple chips. Fingers crossed that Apple realises this and designs truly AI-dedicated, affordable hardware, and perhaps even its own software along the lines of Logic and Final Cut.

Speed comparison

This refers to the FLUX schnell / dev model quantized to 8-bit precision. In this context, "8-bit" typically means that the model's parameters are stored using 8 bits per value, reducing memory usage and potentially increasing inference speed.

8-bit quantization can be implemented in various ways; two common formats are FP8, which stores each weight as an 8-bit floating-point value, and Q8, which stores weights as 8-bit integers together with per-block scale factors.
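To make the difference concrete, here is a minimal PyTorch sketch (an illustration of the two ideas, not how any particular FLUX checkpoint is packed on disk) that round-trips the same toy weights through an FP8 cast and a simple absmax int8 quantization:

```python
import torch

# Toy stand-in for a block of FLUX weights.
w = torch.randn(8, dtype=torch.float32)

# FP8 (e4m3 variant): a direct cast to an 8-bit floating-point dtype.
# Requires PyTorch >= 2.1; precision loss is non-uniform across the range.
w_fp8 = w.to(torch.float8_e4m3fn).to(torch.float32)

# Q8-style integer quantization: scale the block so the largest absolute
# value maps to 127, round to int8, and keep the scale for dequantization.
scale = w.abs().max() / 127.0
q = torch.round(w / scale).to(torch.int8)
w_q8 = q.to(torch.float32) * scale

print("original:      ", w)
print("fp8 round-trip:", w_fp8)
print("q8 round-trip: ", w_q8)
```

The practical difference: FP8 keeps a floating-point layout, so it has more relative precision near zero and less near the extremes, while Q8 spreads 256 integer steps uniformly across each block's range and stores an extra scale per block.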

Flux original.

https://civitai.com/models/618692?modelVersionId=691639

Flux dev 8 bits.

https://huggingface.co/Kijai/flux-fp8

Flux schnell 8 bits.

https://huggingface.co/PrunaAI/FLUX.1-schnell-8bit

https://civitai.com/models/895985?modelVersionId=1003372

GGUF refers to another quantization format, originally from the llama.cpp ecosystem, with variants ranging from aggressive 2-bit (Q2_K) up to 8-bit (Q8_0); a sketch of the Q8_0 block layout follows the links below.

https://huggingface.co/city96/FLUX.1-dev-gguf

https://civitai.com/models/648580/flux1-schnell-gguf-q2k-q3ks-q4q41q4ks-q5q51-q5ks-q6k-q8
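As a rough illustration of what a GGUF quant actually stores, here is a minimal NumPy sketch of the Q8_0 scheme (one fp16 scale per block of 32 int8 weights, i.e. about 8.5 bits per weight). The real GGUF file format adds headers, tensor metadata, and many more elaborate quant types (Q2_K through Q6_K, etc.), so treat this as a sketch of the core idea only:

```python
import numpy as np

BLOCK_SIZE = 32  # Q8_0 quantizes weights in blocks of 32 values

def q8_0_quantize(block: np.ndarray):
    """One Q8_0 block: an fp16 scale plus 32 int8 weights."""
    amax = float(np.abs(block).max())
    d = amax / 127.0 if amax > 0 else 1.0       # per-block scale
    qs = np.round(block / d).astype(np.int8)    # 8-bit integer weights
    return np.float16(d), qs

def q8_0_dequantize(d, qs):
    """Recover approximate float weights: x ~ d * q."""
    return qs.astype(np.float32) * np.float32(d)

# Toy usage on random "weights".
w = np.random.randn(BLOCK_SIZE).astype(np.float32)
d, qs = q8_0_quantize(w)
print("max abs error:", np.abs(w - q8_0_dequantize(d, qs)).max())
```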

NF4 (4-bit NormalFloat) refers to a quantization technique introduced in the QLoRA paper. It represents model weights with 4 bits each, using 16 levels placed at the quantiles of a normal distribution, which trained weights approximately follow; see the sketch after the links below.

https://civitai.com/models/638187?modelVersionId=819165

https://huggingface.co/lllyasviel/flux1-dev-bnb-nf4
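For NF4, here is a simplified NumPy sketch of the idea: place 16 levels at quantiles of a standard normal, absmax-normalize each block of weights, and store each weight as the 4-bit index of its nearest level. Note that the actual QLoRA/bitsandbytes codebook is constructed slightly differently (asymmetric, with an exact zero level), so the level values below are illustrative only:

```python
import numpy as np
from scipy.stats import norm

def nf4_codebook():
    """16 levels at evenly spaced normal quantiles, rescaled into [-1, 1].
    Illustrative only; the real NF4 codebook differs in detail."""
    levels = norm.ppf(np.linspace(0.02, 0.98, 16))  # avoid infinite tails
    return levels / np.abs(levels).max()

def nf4_quantize(block, codebook):
    """Absmax-normalize the block, then snap each weight to the nearest
    codebook level, storing a 4-bit index per weight plus one scale."""
    amax = float(np.abs(block).max())
    scale = amax if amax > 0 else 1.0
    idx = np.abs(block[:, None] / scale - codebook[None, :]).argmin(axis=1)
    return scale, idx.astype(np.uint8)

def nf4_dequantize(scale, idx, codebook):
    return (codebook[idx] * scale).astype(np.float32)

cb = nf4_codebook()
w = np.random.randn(64).astype(np.float32)
scale, idx = nf4_quantize(w, cb)
print("max abs error:", np.abs(w - nf4_dequantize(scale, idx, cb)).max())
```

Because trained network weights are roughly normally distributed, packing the 16 levels densely near zero (where most weights live) wastes less precision than 16 evenly spaced levels would.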
