Upscaling Using Stable Diffusion

Preface

While working on the topic of upscaling, I came across some interesting side effects. In this article I limit myself to the upscaling of images as it is known from popular web user interfaces such as Easy Diffusion or AUTOMATIC1111.

Prerequisites

The script I wrote is simple Python 3 code. It uses an upscaler model published on Hugging Face by the user stabilityai under the name stable-diffusion-x4-upscaler [1].

The required hardware is a standard personal computer with an NVIDIA graphics card that has 8 GB of VRAM. Side note: better hardware will allow better results.

Test Image

I prepared a test image for the upscaling. For this I used AI and one of my new LoRAs to create a vulture in Origami art style.

Figure 1: Photo of an Origami vulture at a resolution of 512 x 512 pixels.

Problems

While testing this approach, I constantly ran into out-of-memory errors on the GPU. This happened from the very first attempt, when I tried to upscale an image of 512 x 512 pixels.

So I first had to resize the input image to a smaller one. The maximum resolution that worked on my installation was 355 x 355 pixels.
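
The downscaling step can be sketched as a small helper that shrinks an image size to fit a maximum side length while keeping the aspect ratio. The function name fit_within is made up for illustration, and its default of 355 pixels is simply the limit reported above; on other hardware the limit will differ.

```python
def fit_within(width, height, max_side=355):
    """Return (width, height) scaled down so the longer side is at most max_side.

    Hypothetical helper: sizes that already fit are returned unchanged,
    and the aspect ratio is kept (rounded down to whole pixels).
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    new_width = max(1, width * max_side // longest)
    new_height = max(1, height * max_side // longest)
    return new_width, new_height


print(fit_within(512, 512))   # square test image
print(fit_within(1024, 512))  # landscape image, aspect ratio preserved
```

The returned tuple can be passed directly to image.resize(...) from PIL.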

The installation of the model should work like this:

git lfs install
git clone https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler

Up to now I have not been able to clone this or any other model. I had to download all the required files one by one.
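
As an alternative to cloning, the files of a model repository can usually be fetched with the huggingface_hub Python package. The following is a minimal sketch, assuming that huggingface_hub is installed and that the repository from [1] is the one wanted. Whether it avoids the cloning problem described above is untested. The function name download_model and the two constants are chosen for illustration only.

```python
REPO_ID = "stabilityai/stable-diffusion-x4-upscaler"  # repository from reference [1]
LOCAL_DIR = "./stable-diffusion-x4-upscaler"          # folder the upscaling script loads from


def download_model(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Download all files of a Hugging Face model repository into local_dir."""
    # Imported inside the function so this file can be loaded
    # even when huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    # Returns the path of the folder that now contains the model files.
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Call once to fetch the files (needs network access and several GB of disk space):
# download_model()
```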

Python Script

The following Python script is used to perform the upscaling.

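from PIL import Image
from diffusers import DiffusionPipeline
import torch

load_path = "vulture_original.jpg"
save_path = "vulture_upscaled.jpg"

# Load the test image as RGB and shrink it so that the upscaling fits into GPU memory.
image = Image.open(load_path).convert("RGB")
image = image.resize((355, 355))

# Load the locally stored upscaler model in half precision and move it to the GPU.
pipe = DiffusionPipeline.from_pretrained("./stable-diffusion-x4-upscaler",
           use_safetensors=True, torch_dtype=torch.float16).to("cuda")

# Run the upscaling; the prompt guides the details the model adds.
upscaled_image = pipe(prompt="intricate photo, highly detailed, sharp focus",
                     image=image, num_inference_steps=20).images[0]

upscaled_image.save(save_path)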
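
Since GPU memory was the limiting factor, one option worth trying is to enable the memory-saving switches that diffusers pipelines offer. The sketch below is an assumption about what might help, not something tested in this article. enable_attention_slicing() trades some speed for lower peak memory, and enable_model_cpu_offload() (which needs the accelerate package) keeps parts of the model on the CPU until they are used. The helper name load_low_memory_pipeline is made up for illustration.

```python
MODEL_DIR = "./stable-diffusion-x4-upscaler"  # same local folder as in the script above


def load_low_memory_pipeline(model_dir=MODEL_DIR, cpu_offload=False):
    """Load the upscaler pipeline with memory-saving options enabled (untested sketch)."""
    # Imported inside the function so this file can be loaded without a GPU setup.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        model_dir, use_safetensors=True, torch_dtype=torch.float16)

    # Compute attention in slices to reduce peak VRAM usage.
    pipe.enable_attention_slicing()

    if cpu_offload:
        # Moves sub-models to the GPU only while they are needed (requires accelerate).
        pipe.enable_model_cpu_offload()
    else:
        pipe = pipe.to("cuda")
    return pipe
```

Whether this allows an input larger than 355 x 355 pixels on an 8 GB card would have to be checked by experiment.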

Upscaling

Upscaling using the stable-diffusion-x4-upscaler model results in the following image.

Figure 2: Photo of an upscaled Origami vulture at a resolution of 1408 x 1408 pixels.

With the model used, some details are lost and the image is slightly blurry.
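
The output of 1408 x 1408 pixels is not exactly four times the 355 x 355 input, which would be 1420 x 1420. One plausible explanation, which is an assumption and not confirmed here, is that the pipeline rounds the input size down to a multiple of some block size before scaling by four. The sketch below only reproduces that arithmetic. The function expected_output_size and its multiple parameter are hypothetical and not part of diffusers.

```python
def expected_output_size(width, height, scale=4, multiple=8):
    """Round each side down to a multiple of `multiple`, then scale it.

    Hypothetical arithmetic only; the real rounding rule of the
    pipeline is not confirmed in this article.
    """
    rounded_width = width - width % multiple
    rounded_height = height - height % multiple
    return rounded_width * scale, rounded_height * scale


print(expected_output_size(355, 355))              # (1408, 1408)
print(expected_output_size(355, 355, multiple=1))  # (1420, 1420), no rounding
```

A block size of 8, 16 or 32 each rounds 355 down to 352 and therefore gives the observed 1408 x 1408 pixels.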

Upscaled Using OpenCV

I upscaled the same test image with OpenCV and deep learning. The result is of better quality and higher resolution. How to do this will be explained in another article.

Figure 3: Photo of an upscaled Origami vulture at a resolution of 2048 x 2048 pixels.

This approach is handier than the first one. Its disadvantage so far is that it needs more time for the upscaling. However, it requires fewer resources than the Stable Diffusion approach.

To-Do

Improve the Python code. Check whether higher resolutions are possible for the image to be upscaled. Speed up the OpenCV solution. Try to get better and sharper results from both approaches. Find out why cloning the model did not work.

Conclusion

It could be shown that it is possible to upscale images with Python in an easy way. The results are limited by the model used and by the capabilities of the available hardware.

It could also be shown that an image can be upscaled to a higher resolution with better quality by using OpenCV.

Final Words

Have a nice day. Have fun. Be inspired!

References

[1] https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/tree/main