Your CLIP is forced in FP32 - Why That Matters

Why is CLIP forced in FP32?

  • By default, all major GUIs use the CPU for handling CLIP.

  • Forge users: currently the GUI forces GPU use and ignores most user command-line arguments.

Why does this matter?

  • When the CLIP model is loaded, it has to be up-cast to FP32 and stored in RAM.

  • An up-cast FP16 CLIP should have just been shipped as FP32 in the first place; you're losing precision just to save roughly 200 MB for CLIP-L and 1.25 GB for CLIP-G.
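As a quick illustration, here is a minimal PyTorch sketch using a made-up 768x768 tensor as a stand-in for a CLIP weight (not a real model): up-casting FP16 weights to FP32 doubles their memory footprint but cannot bring back the precision that was rounded away when the weights were first saved in FP16.

```python
# Minimal sketch: up-casting FP16 weights to FP32 doubles their memory
# footprint but cannot recover the precision lost when the weights were
# originally saved in FP16.
import torch

torch.manual_seed(0)

# Stand-in for a CLIP weight tensor (hypothetical shape, not a real model).
fp32_original = torch.randn(768, 768, dtype=torch.float32)

fp16_saved = fp32_original.to(torch.float16)   # what ships inside most checkpoints
fp32_upcast = fp16_saved.to(torch.float32)     # what the GUI does at load time

print("FP16 size: %.1f MB" % (fp16_saved.numel() * 2 / 1e6))
print("Up-cast FP32 size: %.1f MB" % (fp32_upcast.numel() * 4 / 1e6))

# The up-cast tensor is still only FP16-accurate: the rounding error
# against the true FP32 weights does not go away.
print("Max error vs. original FP32:", (fp32_upcast - fp32_original).abs().max().item())
```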

What Should You Do?

  • Option One: Merge FP32 CLIP Models
    You can merge an FP32 CLIP into your favorite model. This increases model size but ensures better accuracy across the board, and it will not decrease your it/s unless you have very low system RAM. (See the sketch after this list.)

  • Option Two: Force the GUI to Use VRAM


If you have sufficient GPU VRAM, you can configure the GUI to process CLIP in VRAM. If you're going this route, I would use BF16 for your CLIP.

  • Forge: --always-gpu (as of 12/30/2024, Forge acts as if this flag is already in use)

  • Comfy: --gpu-only
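If you want to try Option One yourself, here is a rough sketch of a merge using safetensors. The file names and the cond_stage_model. key prefix are assumptions (prefixes differ between SD1.5, SDXL, and other model families), so inspect your own checkpoint's keys before running anything like this.

```python
# Rough sketch: overwrite a checkpoint's FP16 text-encoder weights with
# weights from a standalone FP32 CLIP file, then save the merged model.
# File names and the key prefix below are assumptions -- inspect your own
# checkpoint's keys (they differ between SD1.5, SDXL, etc.) before use.
import torch
from safetensors.torch import load_file, save_file

CHECKPOINT = "my_model_fp16.safetensors"   # hypothetical paths
FP32_CLIP  = "clip_l_fp32.safetensors"
OUTPUT     = "my_model_fp32_clip.safetensors"
TE_PREFIX  = "cond_stage_model."           # assumed text-encoder key prefix

model = load_file(CHECKPOINT)
clip  = load_file(FP32_CLIP)

replaced = 0
for key in list(model.keys()):
    if key.startswith(TE_PREFIX):
        clip_key = key[len(TE_PREFIX):]    # map checkpoint key -> CLIP key
        if clip_key in clip:
            model[key] = clip[clip_key].to(torch.float32)
            replaced += 1

print(f"Replaced {replaced} text-encoder tensors with FP32 weights")
save_file(model, OUTPUT)
```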

Specific Commands for CLIP Optimization when on CPU

These flags ensure Torch doesn't take your FP32 CLIP, convert it to FP16, and then up-cast it back to FP32 (yes, it can do that). The sketch after this list shows what that round trip does.

  • Forge: --clip-in-fp32

  • Comfy: --fp32-text-enc
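As a quick demonstration of what those flags prevent (a minimal sketch, not tied to any particular GUI): a tensor round-tripped through FP16 still reports an FP32 dtype, but the low-order bits are gone for good.

```python
# Minimal sketch of the silent FP32 -> FP16 -> FP32 round trip: the result
# still claims to be FP32, but it is no longer the same tensor.
import torch

torch.manual_seed(0)
fp32 = torch.randn(1024, dtype=torch.float32)

round_tripped = fp32.to(torch.float16).to(torch.float32)

# The dtype says FP32, yet the extra mantissa bits are lost.
print("dtype after round trip:", round_tripped.dtype)
print("tensors identical:", torch.equal(fp32, round_tripped))
print("max absolute error:", (fp32 - round_tripped).abs().max().item())
```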

Petition your favorite model creators

Petition your favorite model creators to, at minimum, release a mixed-precision model with the UNet in FP16/BF16 and the CLIP/VAE in FP32. This is simple to do and has no downside.

The CLIP model included with nearly all models on this site is FP16. This was done to reduce model size and keep everything in VRAM, but it predates Torch's support for auto-casting and mixed-precision models. Today we can have an FP8 UNet with an FP32 CLIP (and even mixed-precision blocks in the same model).
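For anyone curious what that looks like in practice, here is a rough sketch of building such a mixed-precision checkpoint with safetensors. The key prefixes are assumptions for an SD1.5-style checkpoint; adjust them for your model family.

```python
# Sketch of building a mixed-precision checkpoint: UNet weights stored in
# FP16, text encoder (CLIP) and VAE kept in FP32. Prefixes are assumptions
# for an SD1.5-style checkpoint -- adjust them for your model family.
import torch
from safetensors.torch import load_file, save_file

SOURCE = "my_model_fp32.safetensors"       # hypothetical FP32 source checkpoint
OUTPUT = "my_model_mixed.safetensors"

UNET_PREFIX = "model.diffusion_model."     # assumed UNet key prefix
KEEP_FP32   = ("cond_stage_model.", "first_stage_model.")  # CLIP and VAE

tensors = load_file(SOURCE)
for key, value in tensors.items():
    if key.startswith(UNET_PREFIX):
        tensors[key] = value.to(torch.float16)   # shrink the UNet
    elif key.startswith(KEEP_FP32):
        tensors[key] = value.to(torch.float32)   # keep CLIP/VAE at full precision

save_file(tensors, OUTPUT)
print("Saved mixed-precision checkpoint to", OUTPUT)
```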
