Looking for hires fix advice
I'm running an RTX3060 with 12GB of VRAM. Nothing spectacular, but not exactly junk, either.
Anyway, when I generate a 512x768 picture and apply hires fix, the best I can manage is about a 2x upscale. Anything past that, and I get CUDA out-of-memory errors.
Are there any tricks I should be aware of? Is that just my hardware's limits? Is there something I've got set badly?
4 Answers
From my understanding, the whole point of hires fix is to fix deformities that show up when you generate outside the range of data the model was trained on. For example, if most of the training data was 512x512 and you generate at a different size or aspect ratio, you can get weird outputs (extra limbs, heads, arms, etc.). The further you stray from the training resolution and ratio, the more likely those weird outputs become.
Thus, hires.fix was born. The idea is to run hires fix at a lower scale like 1.2-1.5 (or even 2x if you wish), fix any issues with inpainting, and then upscale using the extras tab. This lets you upscale your image by 4x or more.
The good thing about upscaling in the extras tab is you can use two upscalers. I personally use lollypop as my main one and 4x-ultrasharp as the second one with a 0.3-0.4 visibility. (extra upscalers are here and can be placed inside models\ESRGAN.)
Using your example (I have the same GPU, btw): I made a 512x768 image, ran hires fix to bring it to 1024x1536, and sent it to the extras tab (you'd probably want to inpaint any problem areas with a 512x512 mask first, though). I then upscaled it by 4x, so it's now 4096x6144, which is ridiculously detailed. IrfanView (an image viewing program) opens it at only 15.5% zoom by default.
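If it helps to see the numbers, here's a minimal Python sketch of the resolution chain from that example. The `scale` helper and the variable names are just made up for illustration; nothing here calls the webui.

```python
def scale(size, factor):
    """Return (width, height) with both sides multiplied by an integer factor."""
    width, height = size
    return width * factor, height * factor


base = (512, 768)                     # original generation size
after_hires = scale(base, 2)          # hires fix at 2x
after_extras = scale(after_hires, 4)  # extras tab upscale at 4x

steps = [
    ("base", base),
    ("hires fix 2x", after_hires),
    ("extras 4x", after_extras),
]
for label, (w, h) in steps:
    # Report each stage's dimensions and its size in megapixels.
    print(f"{label:>12}: {w} x {h} ({w * h / 1e6:.1f} megapixels)")
```

The last step works out to roughly 25 megapixels, which is why a viewer has to zoom way out to fit it on screen.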
You can try adding "--medvram" and/or "--lowvram" to your webui startup arguments.
They will drastically slow down image generation, so you won't want to leave them enabled if you care about speed.
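For a rough sense of why things get expensive past 2x: the number of pixels grows with the square of the scale factor. The sketch below only counts pixels for a 512x768 base. It is not a VRAM estimate, since actual memory use depends on the model, the attention implementation, and other settings, and the function name is just for illustration.

```python
def relative_pixels(factor):
    """How many times more pixels an image has after scaling both sides by `factor`."""
    return factor ** 2


base_w, base_h = 512, 768

for factor in (1.5, 2, 3, 4):
    # Target size after scaling each side; int() drops any fractional pixel.
    w, h = int(base_w * factor), int(base_h * factor)
    print(f"{factor}x -> {w} x {h}, {relative_pixels(factor):.2f}x the pixels of the base image")
```

So going from 2x to 3x is not "one step more": it means working on more than twice as many pixels as the 2x pass.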
I generate at 512x768 and then I use SD Upscaler in img2img instead of HiresFix. It allows me to resize 4x or more. Be sure to use a low denoise like 0.3-0.4 or it will mess up the picture.
You could also do 2x HiresFix and then img2img upscale.
I can run hires fix from 1280x720 at 3x with the Latent upscaler since I upgraded torch and xformers, and I only have a 3070 with 8GB of VRAM.
python: 3.10.9 • torch: 2.0.0+cu118 • xformers: 0.0.17rc482 • gradio: 3.16.2