Saving space: How to shrink larger models (Make any model fp16)

[WIP] Guide still in progress, but it should work. Feel free to leave constructive criticism. :)

TL;DR: How to turn an fp32 model into an fp16 model. Don't wanna read this? Skip to "How: Shrinking".


Sales-pitch:
Have you found a cool-looking model, only to be greeted with a 6 GB file that'll make your hard drive, RAM, or GPU cry?
Do you want to store your models somewhere other than your A1111 directory?
Or do you want multiple models to use the same VAE without each copy taking up additional space?

Well, I've got good news for you!

Technical:
How to convert a float32 model to a float16 model, and strip away unnecessary data (such as data for further training).
How to have A1111 check multiple directories for models.
How to have identical VAEs share the same space in storage.

Terminology:
Model: In this case, a .CKPT (checkpoint) or .SAFETENSORS file. The thing that does the Stable Diffusing. Usually sized 2~6 GB.
VAE: The thing that translates Latent Data into Pixel Data. Aka, colour-picker.
Float: A way a computer can store a not-whole number.
Float16 / fp16: Uses 16 bits (2 bytes) to store a number. Fast, but (very slightly) less accurate.
Float32 / fp32: Uses 32 bits (4 bytes) to store a number. Slower, but more accurate. (See the quick demo after this list.)
EMA: Data used when training a model. Useful if you plan to train it further; otherwise completely useless.
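
If you're curious what "(very slightly) less accurate" means in practice, here's a quick Python snippet (using numpy purely as a demo tool; it's not part of this guide's workflow) that rounds a float32 value down to float16:

```python
import numpy as np

# An example weight value, stored at both precisions.
w32 = np.float32(0.123456789)
w16 = np.float16(w32)

print(w32)  # 0.12345679 -> float32 keeps ~7 decimal digits
print(w16)  # 0.1235     -> float16 keeps ~3 decimal digits

# The rounding error is on the order of 0.00002 -- far too small
# to visibly change a Stable Diffusion image on its own.
print(abs(float(w32) - float(w16)))
```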

Why / Effect:
Q: Why would you convert a perfectly normal model to a smaller model?
A: Well, you don't have to, but if you keep multiple models -that are several GB in size- on your drive, that adds up fast, especially when you're not using them all at the same time. Making them ~50% smaller, and cutting off the unused parts, is a great way to save some space. (It also speeds up loading time!)

Q: Won't that affect image quality?
A: You'd think so, but surprisingly enough, no*
* = Yes, but a nearly imperceptibly tiny amount (at least for Stable Diffusion purposes).
Also, depending on your GPU and A1111 settings, you might already be running the model in fp16 mode; it just gets converted during loading.

Disclaimer:
The shrunken model has a different hash, so CivitAI might not tag it as a used resource when you upload a generated image! But the name is still stored in the PNG generation data, so a human can still tell which model was used.
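
If you want to see this for yourself: as far as I know, A1111's newer "Model hash" is the first 10 hex characters of the file's SHA-256, so a short Python snippet shows the two files hash differently. (The file names below are just examples.)

```python
import hashlib

def model_hash(path: str) -> str:
    """First 10 hex chars of a file's SHA-256 (A1111-style short hash)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()[:10]

# Example file names -- substitute your own.
print(model_hash("AngrAFlex_2.0.safetensors"))       # original model
print(model_hash("AngrAFlex_2.0_fp16.safetensors"))  # shrunken -> differs
```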

How:
Shrinking:

  1. Install extension(s)
    Converter: https://github.com/Akegarasu/sd-webui-model-converter.git
    Analyzer (optional): https://github.com/arenasys/stable-diffusion-webui-model-toolkit.git

  2. Find a model you wish to shrink, and place it where you usually put your models. (Larger than 2 GB, as that's roughly the "minimum" size. For demonstration purposes I will be using AngrAFlex 2.0.)

  3. (Optional), check what parts of the model are taking up space.
    3.1. Go to the Toolkit tab in A1111. Select the model you downloaded, and hit Load.
    3.2. It'll show several stats. You can also change stuff here, hit Save, and it'll export the new model, but for this guide I'm just using it to show the size difference.
    The stats of interest here are "junk data", which is EMA (data used for training, and therefore not needed during generation), and "Wasted on precision", which is how much space is currently used to store the more precise 32-bit numbers.
    In the example case, the model as a whole is 5.55 GB: 1.60 GB of it isn't actually used during generation, and 1.97 GB goes to storing 32-bit floats. Remove both, and we're left with a model of ~1.99 GB. A great space saving!
    3.3. When done, hit clear to free the model from RAM.

  4. Actually shrinking it.
    4.1. Go to the Model Converter tab in A1111, and select the model you wish to change.
    4.2. Give the new model a name (Custom Name (Optional)); I'd recommend "NAME_fp16".
    4.3. For the settings:
    Precision: fp16. Pruning method: no-ema*
    Show extra options [x]
    unet: convert. text encoder: convert. vae: convert
    others: convert
    * I think in a previous version no-ema and ema-only were swapped. If the resulting model only generates gibberish, select the other one, re-shrink the model, and re-generate; it should work. :)
    4.4. Hit Run, wait a few seconds (a bit of RAM and swap will be used), and out should pop a shrunken model! (Located where your models normally are.) Curious what the converter is actually doing? See the sketch after this list.

  5. Select the model to generate with, and test.
    5.1. If it doesn't work, see Issues/help. If it still doesn't work, please share the error message in the comments. :)
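
For the curious, here's roughly what the conversion boils down to. This is a minimal stand-alone sketch, not the extension's actual code: it assumes a .safetensors model, uses made-up file names, and relies on the "model_ema." key prefix that standard SD 1.x checkpoints use for their EMA weights.

```python
import torch
from safetensors.torch import load_file, save_file

# Hypothetical file names -- substitute your own model.
src = "AngrAFlex_2.0.safetensors"
dst = "AngrAFlex_2.0_fp16.safetensors"

state = load_file(src)

slim = {}
for key, tensor in state.items():
    # Skip EMA weights (the "junk data" the Toolkit reports);
    # they're only needed if you want to train the model further.
    if key.startswith("model_ema."):
        continue
    # Downcast 32-bit floats to 16-bit; leave other dtypes alone.
    if tensor.dtype == torch.float32:
        tensor = tensor.half()
    slim[key] = tensor

save_file(slim, dst)
```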

Store elsewhere:
[WIP]
Windows search "PowerShell", open as Admin.
`cd {A1111}\models\Stable-diffusion`
`New-Item -ItemType SymbolicLink -Name {name_local} -Value {path_remote}\{name_remote}`
This creates a symbolic link inside the models folder that points at the real file stored elsewhere, so A1111 treats it as if it were local.

Shared VAEs:
[WIP] (Until this section is written, one possible approach is sketched below.)
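
One way to let identical VAE files share storage is a hard link: two file names pointing at the same data on disk. Treat this as a stopgap sketch, not the finished section. The paths are made up, and it assumes both locations sit on the same NTFS volume (hard links can't cross drives):

```python
import os

# Hypothetical paths -- adjust to your own setup.
master    = r"C:\SD\vae\vae-ft-mse-840000.safetensors"
duplicate = r"C:\A1111\models\VAE\vae-ft-mse-840000.safetensors"

# Replace the duplicate with a hard link to the master copy, so both
# names point at the same bytes and only one copy uses disk space.
if os.path.exists(duplicate):
    os.remove(duplicate)
os.link(master, duplicate)
```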

Issues / help:
Q: Shrunk model only produces garbage!
A: Reshrink it with the other option for pruning method [4.3].
Q: Shrunk model only produces a black square, or I get NaN VAE warnings.
A: Use a different VAE, or reshrink it with "vae: copy" [4.3].

Versions used:
A1111: v1.5.1
Model Converter: 9ab009e2
Toolkit: 4d8fea77
OS: Win 10 Pro x64
