Introduction
You keep seeing all these INCREDIBLE pictures with no descriptions. How do they do that?
Enter img2img with the canny ControlNet. Transform ANY image with astounding accuracy.
This guide assumes you are using AUTOMATIC1111.
What are ControlNets?
Think of ControlNets like the guide strings on a puppet: they help decide where the puppet (or data) should move. There are several ControlNets available for Stable Diffusion, but this guide focuses only on the "canny" ControlNet. The official page for canny is available here.
Installing
Install the v1.1 ControlNet extension here under the "Extensions" tab -> "Install from URL"
If you already have the v1 ControlNet extension installed, first delete its folder from
stable-diffusion-webui/extensions/<controlnet extension>
Download the control_v11p_sd15_canny.pth and control_v11p_sd15_canny.yaml files here. Place both files in
stable-diffusion-webui/models/ControlNet
Reload the UI. After reloading, you should see a new "ControlNet" section.
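If you prefer to script the download step, here is a minimal Python sketch. The Hugging Face URLs are an assumption based on the usual lllyasviel/ControlNet-v1-1 repository layout, so verify them against the download page linked above, and adjust the destination path to your own install:

```python
# Minimal download sketch. The URLs assume the files live in the
# lllyasviel/ControlNet-v1-1 repo on Hugging Face; verify them
# against the official download page before running.
from pathlib import Path
import urllib.request

BASE = "https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main"
FILES = ["control_v11p_sd15_canny.pth", "control_v11p_sd15_canny.yaml"]

# Adjust to wherever your AUTOMATIC1111 install lives.
dest = Path("stable-diffusion-webui/models/ControlNet")
dest.mkdir(parents=True, exist_ok=True)

for name in FILES:
    target = dest / name
    if target.exists():
        print(f"{name} already present, skipping")
        continue
    print(f"Downloading {name}...")
    urllib.request.urlretrieve(f"{BASE}/{name}", target)
print("Done. Reload the UI to pick up the new model.")
```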
Using Canny with Img2Img
Select the "img2img" tab in AUTOMATIC1111
Enter your prompt and negative prompt
Select sampler and number of steps
Put your source image in the img2img field (not the controlnet image field)
Set width & height to the same size as the input image
Controlnet settings
Enable: Checked
Guess Mode: Checked (only for pre 1.1)
If you are using ControlNet v1.1, check the "Pixel Perfect" box to set the annotator resolution and canvas size automatically
Preprocessor: canny
Model: control_v11p_sd15_canny (use control_canny-fp16 if you are still on the v1 models)
Annotator resolution: 768
Canvas width/height: same as source image
High and low threshold: default
These affect how sensitive the annotator is to gradient changes. The default settings are usually acceptable, but poor lighting may require additional fine-tuning. Make changes in increments of 10 on a single slider at a time until you are satisfied with the wireframe (see the edge-detection sketch under "How does this work?" below).
Higher denoising values apply the prompt more strongly (only for pre-1.1)
In 1.1 this has been simplified to buttons that emphasize Balanced / Prompt / ControlNet
Generate!
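If you'd rather drive all of this from code, the same settings map onto the webui API. The sketch below is an assumption-laden example, not a reference: it assumes the webui was launched with --api and that your ControlNet extension build accepts the usual "alwayson_scripts" payload (field names can vary between versions). The prompt, file names, and sizes are placeholders.

```python
# img2img with a canny ControlNet unit via the AUTOMATIC1111 API.
# Assumes the webui was started with --api; field names may differ
# between ControlNet extension versions, so treat this as a sketch.
import base64
import requests

with open("source.png", "rb") as f:
    source_b64 = base64.b64encode(f.read()).decode()

payload = {
    "prompt": "a castle on a hill, golden hour",   # your prompt
    "negative_prompt": "blurry, low quality",
    "init_images": [source_b64],                   # img2img field, not the ControlNet field
    "sampler_name": "Euler a",
    "steps": 20,
    "width": 512, "height": 512,                   # match the input image
    "denoising_strength": 0.75,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "enabled": True,
                "module": "canny",                  # preprocessor
                "model": "control_v11p_sd15_canny", # or control_canny-fp16 on v1
                "pixel_perfect": True,              # v1.1 auto resolution
            }]
        }
    },
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload)
r.raise_for_status()
with open("result.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```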
How does this work?
Canny draws outlines around the shapes in the input image. The ControlNet then feeds those outlines into the diffusion process alongside your prompt, so the final image keeps the source's structure while the prompt steers the content.
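Under the hood, the canny annotator is ordinary Canny edge detection. Here is a small OpenCV sketch you can use to preview the wireframe and experiment with the high/low thresholds from the settings above; the 100/200 starting values are an assumption, so check your extension's sliders for the actual defaults:

```python
# Preview what the canny annotator sees, using plain OpenCV.
# The 100/200 values are an assumed starting point; check the
# ControlNet sliders for your extension's actual defaults.
import cv2

image = cv2.imread("source.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

low, high = 100, 200   # nudge one value at a time, in steps of 10
edges = cv2.Canny(gray, low, high)

cv2.imwrite("wireframe.png", edges)
```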
txt2img usage
This technique works similarly in txt2img by putting the image in the ControlNet image field instead, but it retains less of the original image in the final result. A comparison of the same prompt/seed/sampler/checkpoint in img2img vs. txt2img is below:
Img2Img:
Text2Img:
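For scripting, the txt2img variant is the same API sketch as in the img2img section, minus init_images; the source image goes into the ControlNet unit instead. As before, this assumes --api and a typical extension build; the image field has been named "input_image" on older builds and "image" on newer ones, so check your version.

```python
# txt2img with a canny ControlNet unit: no init_images, the source
# image goes into the ControlNet unit. Sketch only; the image field
# may be "input_image" or "image" depending on extension version.
import base64
import requests

with open("source.png", "rb") as f:
    source_b64 = base64.b64encode(f.read()).decode()

payload = {
    "prompt": "a castle on a hill, golden hour",
    "steps": 20,
    "width": 512, "height": 512,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "enabled": True,
                "module": "canny",
                "model": "control_v11p_sd15_canny",
                "input_image": source_b64,   # ControlNet image field
                "pixel_perfect": True,
            }]
        }
    },
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
r.raise_for_status()
```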
Next Steps
Learn how to use mov2mov to put everything together. Check out the guide here!