Type | VAE |
Stats | 761 10,713 |
Reviews | (39) |
Published | Mar 24, 2024 |
Base Model | SD 1.5 |
Hash | AutoV2 34A4AB128B |
https://huggingface.co/thomaseding/vae-teding-aliased-2024-03
(If you want to use this with my PixelNet model, use checker squares whose width and height are each a multiple of 8.)
A fine-tuned Stable Diffusion 1.5 VAE decoder for better pixel art generation; it works by aliasing the decoder's output. The quality of this VAE hinges on the pixel art model having been trained on "tiles" whose dimensions are multiples of 8. Otherwise you will get seam-like artifacts.
Fine-tuning ran for 1 epoch over 50 thousand images at an effective batch size of 12. I preprocessed the images by quantizing each 8x8 tile to its average color. Fine-tuning took about 4 hours on an RTX 3090. I used only MSE loss, at a learning rate of 1e-5. The training set was generated from other Stable Diffusion models and consists mostly of cartoon-like images.
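For anyone who wants to reproduce the tile-averaging preprocessing step, here is a minimal NumPy sketch. It is not the actual training script. The function name `quantize_tiles`, the rounding behavior, and the choice to reject images whose sides are not multiples of 8 are all assumptions made for illustration. Only the "each 8x8 tile becomes its average color" idea comes from the description above.

```python
import numpy as np

TILE = 8  # tile size taken from the description; everything else is illustrative


def quantize_tiles(image: np.ndarray, tile: int = TILE) -> np.ndarray:
    """Replace every tile x tile block of an (H, W, C) image with its mean color.

    Hypothetical helper: rejecting images whose height or width is not a
    multiple of `tile` is an assumption, not something the author specified.
    """
    h, w, c = image.shape
    if h % tile or w % tile:
        raise ValueError(f"image size {w}x{h} is not a multiple of {tile}")

    # View the image as a grid of tiles:
    # (tile rows, pixel rows in tile, tile cols, pixel cols in tile, channels)
    blocks = image.reshape(h // tile, tile, w // tile, tile, c).astype(np.float64)

    # Average over the pixels inside each tile, per channel.
    means = blocks.mean(axis=(1, 3), keepdims=True)

    # Paint each tile with its mean color and restore the original layout.
    out = np.broadcast_to(means, blocks.shape).reshape(h, w, c)

    # Round back for integer images (e.g. uint8); leave float images as-is.
    if np.issubdtype(image.dtype, np.integer):
        out = np.rint(out)
    return out.astype(image.dtype)
```

To use it, load an image into an `(H, W, 3)` array with any image library, crop or resize so both sides are multiples of 8, and pass it through `quantize_tiles`.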