
Flux Blockwise

Verified: SafeTensor
Type: Checkpoint Trained
Published: Nov 28, 2024
Base Model: Flux.1 D
Training: 11,111,111 steps / 1,111 epochs
Hash (AutoV2): C0B540CFD7
Reactions: 29,478 | Downloads: 274,945 | Generations: 2,032,723
The FLUX.1 [dev] Model is licensed by Black Forest Labs, Inc. under the FLUX.1 [dev] Non-Commercial License. Copyright Black Forest Labs, Inc.
IN NO EVENT SHALL BLACK FOREST LABS, INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL.

Flux Blockwise (Mixed Precision Model)

I had to build several custom tools to make this mixed-precision model possible; to my knowledge it is the first one built this way.

  • Faster and more accurate than any other FP8-quantized model currently available

  • Works in Comfy and Forge, but Forge needs to be set to a BF16 UNET

  • Comfy: load it as a diffusion model and USE THE DEFAULT WEIGHT dtype

  • FP16 upcasting should not be used unless absolutely necessary, such as when running on CPU or IPEX

  • FORGE: set COMMANDLINE_ARGS= --unet-in-bf16 --vae-in-fp32

  • Other than the need to force Forge into BF16 (FP32 VAE optional), it should work the same as the DEV model, with the added benefit of being 5 GB smaller than the full BF16 model (see the dtype census sketch after this list)
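If you want to see where that saving comes from, here is a minimal Python sketch that tallies bytes per dtype in a safetensors checkpoint. The file name is a placeholder, not the actual release name:

```python
# Minimal sketch: tally bytes per dtype in a safetensors checkpoint.
from collections import Counter
from safetensors import safe_open

sizes = Counter()
with safe_open("flux-blockwise.safetensors", framework="pt") as f:
    for name in f.keys():
        t = f.get_tensor(name)
        sizes[str(t.dtype)] += t.numel() * t.element_size()

# A mixed-precision checkpoint should show FP8 for the blocks and
# BF16 for everything else.
for dtype, nbytes in sizes.most_common():
    print(f"{dtype}: {nbytes / 2**30:.2f} GiB")
```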

It turns out that every quantized model up to this point, including my own, has (to my knowledge) been built sub-optimally relative to Black Forest Labs' guidance.

Only the UNET blocks of the diffusion model should be quantized, and they should be upcast to BF16, not FP16 (Comfy does this correctly).
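To make that split concrete, here is a minimal sketch of the blockwise scheme. It assumes the usual Flux checkpoint key layout ("double_blocks." / "single_blocks." prefixes); the paths are placeholders, and this illustrates the idea rather than the actual tooling used for this release:

```python
# Sketch of blockwise mixed precision: quantize only the UNET/transformer
# blocks to FP8 and keep everything else in BF16 (not FP16), per the note
# above. Assumes standard Flux key prefixes; paths are placeholders.
import torch
from safetensors.torch import load_file, save_file

IN_PATH = "flux1-dev.safetensors"        # placeholder
OUT_PATH = "flux-blockwise.safetensors"  # placeholder

BLOCK_PREFIXES = ("double_blocks.", "single_blocks.")

state = load_file(IN_PATH)
out = {}
for name, tensor in state.items():
    if not tensor.is_floating_point():
        out[name] = tensor                            # leave integer buffers alone
    elif name.startswith(BLOCK_PREFIXES):
        out[name] = tensor.to(torch.float8_e4m3fn)    # quantize UNET blocks only
    else:
        out[name] = tensor.to(torch.bfloat16)         # everything else stays BF16

save_file(out, OUT_PATH)
```

At inference time the FP8 blocks should then be upcast to BF16, which is the behavior the note above credits Comfy with handling correctly.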


[Images: Hippo Image remix, Lion Image remix]

I am currently trying to work out how to follow the Black Forest Labs recommendations while using GGUF.

Discussion

AbstractPhila

The testing shows some interesting outcomes.

Can you modify the blocks of DeDistilled?

RedPinkRetro

I found the difference between a regular Clip L (a ~300 MB one I had lying around) and the Clip L Large BF16 noticeable, and would prefer the BF16 version: it had more detail, higher contrast, and better accuracy in small details like eyes, pupils, and reflections on jewelry. However, I still prefer using the finetuned Clip L from zer0int.

The T5xxl BF16 version showed no difference from the FP16 version in my test, and only shaves off less than 300 MB.

METAFILM_Ai

That's a really advanced idea! It not only ensures high accuracy at the parameter level, it also keeps the overall model's compute cost in check.

AbstractPhila

I'll build the LoRA stack onto it and see what SimV4 looks like.

AbstractPhila

I have an idea: generate a few hundred pictures and heatmap the block access. For the lesser-used blocks, quantize to a smaller size, prune, and compact. I wonder whether that would lobotomize it, speed it up, or slow it down.

The idea here is that we aren't actually using those blocks for much, which means we're potentially loading a large amount of RAM for nothing, causing a reliance on CPU swapping. If we spend as little time as possible in those blocks, we can potentially eliminate the need for them entirely through pruning.
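For what it's worth, a minimal sketch of that heatmap step might look like the following, using mean absolute activation per block as a rough proxy for "access". Here `model`, `prompts`, and `run_generation` are placeholders for an actual Flux model and sampling loop:

```python
# Sketch: hook every transformer block, accumulate mean |activation| per
# block across many generations, then rank blocks by contribution.
# Lowest-ranked blocks are candidates for heavier quantization or pruning.
from collections import defaultdict

usage = defaultdict(float)
calls = defaultdict(int)

def make_hook(name):
    def hook(module, inputs, output):
        out = output[0] if isinstance(output, tuple) else output
        usage[name] += out.detach().abs().mean().item()
        calls[name] += 1
    return hook

handles = []
for i, block in enumerate(model.double_blocks):
    handles.append(block.register_forward_hook(make_hook(f"double_{i}")))
for i, block in enumerate(model.single_blocks):
    handles.append(block.register_forward_hook(make_hook(f"single_{i}")))

for prompt in prompts:          # a few hundred generations, per the idea
    run_generation(model, prompt)

for h in handles:
    h.remove()

# Blocks with the lowest average activation come first.
ranking = sorted(usage, key=lambda k: usage[k] / max(calls[k], 1))
print(ranking[:10])
```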

atlazisco390

I just started using ComfyUI, but I don't know where I should put the t5xxl file. I used Load CLIP and Load Diffusion Model, but I don't know what to use for the t5xxl. Can someone help me? Thank you in advance!

ProjectDreamer

Is this T5XXL the Flan one? Thank you in advance!

LiteSoulHD

"I am currently trying to workout how to follow Blackforest recommendations but using GGUF".

Waiting for this! Even better if it's an easy process that can be shared with the community. Thanks!