
Making Images Great Again!

(For the best viewing experience, please enable Dark mode in your profile dropdown.) (Includes video representations.)





Today, I'll guide you through a step-by-step tutorial on effortlessly rendering high-quality images.


First, you're going to have to choose a base model. I'll be using the HassakuHentaiModel (may generate NSFW images) to render my images, but whichever you choose, the workflow is the same; only the best Sampling method will differ.


https://imgur.com/a/qugsVGE - GIF -


Now let's choose our Sampling method and the amount of Sampling Steps.


For my model, Sampling method DPM++ 2M Karras seems to give the best results. (This setting will vary depending on your model of choice.)


I'll also be setting the Sampling steps to 35. (Higher step counts can give better results and tend to follow your prompt more closely, but only up to a point.)
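
(A side note for anyone who prefers scripting: the WebUI exposes an HTTP API when launched with the --api flag, and the settings above map straight onto a txt2img request. A minimal sketch; the endpoint and field names are from the A1111 API as I've used it, so verify against your install's /docs page. The prompts are the ones introduced later in this guide, and the 408x512 size is borrowed from the SCULPTING example.)

```python
import base64

import requests

# Assumes the WebUI was launched locally with the --api flag.
URL = "http://127.0.0.1:7860"

payload = {
    "prompt": "(8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed), 1girl",
    "negative_prompt": "easynegative, badhandv4, (worst quality, low quality, normal quality)",
    "sampler_name": "DPM++ 2M Karras",  # best for my model; vary with yours
    "steps": 35,
    "cfg_scale": 7,
    "width": 408,
    "height": 512,
    "seed": -1,  # -1 lets the server pick a random seed
}

response = requests.post(f"{URL}/sdapi/v1/txt2img", json=payload)
response.raise_for_status()

# Images come back as base64-encoded strings.
with open("render.png", "wb") as f:
    f.write(base64.b64decode(response.json()["images"][0]))
```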


https://imgur.com/a/DRRZa2x - GIF -


Now let's choose our starting prompts.

  • Positive: (8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed), 1girl

  • Negative: easynegative, badhandv4, (worst quality, low quality, normal quality), bad-artist, blurry, ugly, ((bad anatomy)),((bad hands)),((bad proportions)),((duplicate limbs)),((fused limbs)),((interlocking fingers)),((poorly drawn face))


For most anime-style models, I like to use the negative prompt above as my starting point. However, if my model is more realistic, I'll use: badhandv4, paintings, sketches, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, age spot, manboobs, double navel, muted arms, fused arms, analog, analog effects, bad architecture, watermark, (mole:1.5), EasyNegative


These basic quality tags help you get well-rounded results whilst saving time. But, like I said, they are restrictions and confinements. To visualize this, take a look at the image below.

You're effectively being put into a box that limits your freedom. I HIGHLY RECOMMEND you apply selective prompting. 👇


(TO LEARN MORE ON PROMPTING, REFER TO THE SECTION - SCULPTING -)


I have all of these added as styles to speed up workflow.
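
(A1111 keeps those styles in a styles.csv file in the webui folder. If you'd rather add one from a script, here's a rough sketch; the three-column layout is an assumption based on the stock file, so check yours first.)

```python
import csv

# Append a reusable style to A1111's styles.csv
# (assumed stock layout: name, prompt, negative_prompt).
with open("styles.csv", "a", newline="", encoding="utf-8") as f:
    csv.writer(f).writerow([
        "Anime Quality",  # name shown in the Styles dropdown
        "(8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed)",
        "easynegative, badhandv4, (worst quality, low quality, normal quality), "
        "bad-artist, blurry, ugly, ((bad anatomy)), ((bad hands)), ((bad proportions))",
    ])
```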


https://imgur.com/ncP9lfn - GIF -


Now let's Generate our image and see what we get.

Not too great, but we aren't done yet.


Let's add some more tags to our positive prompt.

(8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed, detailed background), 1girl, looking at viewer, portrait, black hair, brown eyes, head tilt, upper body, city, jacket, complex, dramatic lighting, rim lighting, night, night sky


Seed-(1968911923)

Getting better, but still not quite what we're looking for.


Let's take this into inpaint and fix any imperfections.

https://imgur.com/a/ORRwD2e - GIF -


Now we have to tell inpaint what we want to change, and what to change it to.

Let's change our positive prompt to: (8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed, detailed background), 1girl, looking at viewer, black hair, brown eyes, dramatic lighting, head tilt


Let's also change the Sampling steps to something higher, like 50, and set the Inpaint area to 'Only masked' for the best results.


Set the brush size to the maximum size by holding CTRL and scrolling up on your scroll wheel, or by clicking on the pencil icon and dragging the slider to the right.


I'll be setting the Denoising strength to 0.6. Denoising strength ranges from 0 to 1; the lower the number, the less the result deviates from the original image.
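
(Scripted, this whole inpainting setup maps onto the img2img endpoint. A sketch under the same assumptions as the earlier txt2img snippet: the mask is a black-and-white image where white marks the region to repaint, and 'inpaint_full_res' is, as far as I can tell, the API's name for the 'Only masked' option.)

```python
import base64

import requests

URL = "http://127.0.0.1:7860"

def b64(path):
    """Read an image file and return it base64-encoded."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

payload = {
    "prompt": "(8k uhd, masterpiece, best quality, high quality, absurdres, "
              "ultra-detailed, detailed background), 1girl, looking at viewer, "
              "black hair, brown eyes, dramatic lighting, head tilt",
    "init_images": [b64("render.png")],  # the txt2img output
    "mask": b64("face_mask.png"),        # white = repaint, black = keep
    "inpaint_full_res": True,            # 'Only masked' inpaint area
    "denoising_strength": 0.6,           # lower = closer to the original
    "steps": 50,
    "sampler_name": "DPM++ 2M Karras",
}

response = requests.post(f"{URL}/sdapi/v1/img2img", json=payload)
response.raise_for_status()
with open("inpainted.png", "wb") as f:
    f.write(base64.b64decode(response.json()["images"][0]))
```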



Now all that's left is painting the mask over her face.

https://imgur.com/a/8M8qnsq - GIF -


Seed-(1484380036)


Now this is better. You can continue to inpaint by dragging the generated image back to the inpainting canvas, masking a new area and providing a new prompt.


https://imgur.com/a/oNOJgoB - GIF -

But for simplicity, and so that you can follow along more easily, we can stop at just the face.


Final rendering process

Drag your generated image onto the canvas again, but this time, click on img2img.


https://imgur.com/a/ndFIzJv - GIF -


Now, you have two options: you can either drag the original prompt from txt2img into img2img, or if you want specific changes done to the final render, you can edit the prompt inside img2img.


I'm going to take the original positive prompt but alter it slightly to recognize the railing behind the girl.


Remember that the further away from the start you place a tag, the less of an impact it will have when generating an image.


Positive: (8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed, detailed background), 1girl, looking at viewer, portrait, black hair, brown eyes, head tilt, upper body, city, jacket, complex, dramatic lighting, rim lighting, night, night sky, railing


Now, you're going to want to set your Sampling steps to something slightly higher; in this case, I'm setting it to 60, while optionally keeping the Sampling method the same. Additionally, click on 'Resize by' and set it to 2. Moving on to the Denoising strength, if you want the output to closely resemble the original image, opt for a lower number. However, since we didn't perform extensive inpainting, I'm going to set it slightly higher, around 0.6, in hopes that it will help fix those vague areas for us.
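
(For the scripted route, this final pass is the same img2img call as the inpainting sketch, just without a mask. One difference worth flagging: as far as I know, the API takes an explicit output size rather than a 'Resize by' factor, so I double the width and height by hand.)

```python
import base64

import requests

URL = "http://127.0.0.1:7860"

def b64(path):
    """Read an image file and return it base64-encoded."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

payload = {
    "prompt": "(8k uhd, masterpiece, best quality, high quality, absurdres, "
              "ultra-detailed, detailed background), 1girl, looking at viewer, "
              "portrait, black hair, brown eyes, head tilt, upper body, city, "
              "jacket, complex, dramatic lighting, rim lighting, night, "
              "night sky, railing",
    "init_images": [b64("inpainted.png")],
    "denoising_strength": 0.6,  # a bit high on purpose, to smooth the vague areas
    "steps": 60,
    "sampler_name": "DPM++ 2M Karras",
    "width": 816,               # 2x the original 408x512 ('Resize by' 2)
    "height": 1024,
}

response = requests.post(f"{URL}/sdapi/v1/img2img", json=payload)
response.raise_for_status()
with open("final.png", "wb") as f:
    f.write(base64.b64decode(response.json()["images"][0]))
```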


Note: If you have limited VRAM (4-6 GB), it may be better to set 'Resize by' to 1.5 or 1.75.

Larger resolutions may not always work on cards with low VRAM.

If you want a larger image but can't generate one inside img2img, refer to the 'Upscaling your image' section in the table of contents.



Now let's generate our final image.


Seed-(1611752004)

Much better. This is still far from perfect, but you can spend as much time as you need in the txt2img and inpainting stages to get a better result.


Post-Processing


You can use Photopea after you render to enhance image visuals.


Tips and Tricks -


Each pair of parentheses () you place around a tag multiplies its weight by 1.1, i.e. roughly a 10% increase per pair, compounding when nested.

(High Quality) = 1.1x (110%)

((High Quality)) = 1.21x (~120%)

(((High Quality))) = 1.33x (~130%)

The opposite can be achieved by placing square brackets [] around your prompt instead; each pair divides the weight by 1.1.


You can achieve the same effect by placing one pair of parentheses and a colon followed by a number, which sets the weight explicitly, e.g. (High Quality:1.1). (A quick sanity-check script follows these examples.)

(High Quality:1.1) = 110%

(High Quality:1.2) = 120%

(High Quality:1.3) = 130%
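
(If you want to sanity-check what a nested tag actually weighs, the arithmetic is just repeated multiplication. A toy helper; note it only handles plain ()/[] nesting, not the explicit (tag:1.2) syntax.)

```python
def effective_weight(tag: str) -> float:
    """Attention weight from ()/[] nesting, using A1111's 1.1 factor per pair."""
    ups = tag.count("(")    # each ( multiplies the weight by 1.1
    downs = tag.count("[")  # each [ divides the weight by 1.1
    return 1.1 ** (ups - downs)

print(round(effective_weight("(High Quality)"), 3))    # 1.1
print(round(effective_weight("((High Quality))"), 3))  # 1.21 -- not exactly 1.2
print(round(effective_weight("[High Quality]"), 3))    # 0.909
```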


By highlighting text and holding the Control key, you can press the up or down arrow keys to increase or decrease its weight in steps of 0.1.


https://imgur.com/a/E4iiVRU - GIF -


If possible, the best way to fix a problem is to do it manually. I have prior experience in digital art, which lets me fix problems that couldn't be fixed using Stable Diffusion alone.

You can also put the image into Photopea and half-fix it, then put that image back into Stable Diffusion to kind of guide it a bit. It doesn't have to be perfect.



Upscaling your image


Upscaling images from a file. (continue here)

First thing you are going to want to do is click on the Extras tab.

https://imgur.com/a/G5Hy6AK - GIF -

Now drag your image file onto the canvas.

https://imgur.com/a/Ha9QE41 - GIF -

Then, for now, just select R-ESRGAN 4x+ and set Resize to 2.

After you hit generate, Stable Diffusion will quickly upscale your image, making it ready to be used for anything.
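
(The Extras tab has its own endpoint too. Same caveats as the other API sketches; in particular, the request and response field names here are from memory, so check /docs.)

```python
import base64

import requests

URL = "http://127.0.0.1:7860"

with open("final.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "image": image_b64,
    "upscaler_1": "R-ESRGAN 4x+",
    "upscaling_resize": 2,  # same as setting Resize to 2 in the UI
}

response = requests.post(f"{URL}/sdapi/v1/extra-single-image", json=payload)
response.raise_for_status()

# This endpoint returns a single "image" field rather than an "images" list.
with open("upscaled.png", "wb") as f:
    f.write(base64.b64decode(response.json()["image"]))
```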


Upscaling an image after generation. (continue here)

https://imgur.com/a/z2FAF8O - GIF -

Once you've sent your image to Extras, follow the steps above for settings.


- SCULPTING -

Let's start simple, using just the positive prompt.

CFG: 7
Width: 408
Height: 512
Seed: 12345
Pos: 1girl eating a sandwich, table, restaurant, portrait, 
Neg: N/A
Steps: 35
Sampling Method: DPM++ 2M Karras

Model: hassakuHentaiModel_v13



Imagine generating images as being similar to sculpting: start with a basic shape and then gradually refine it by chiseling away, until you've crafted the final piece you envisioned.


Using a Basic-HighQuality prompt such as (masterpiece, best quality, high quality, highres, ultra-detailed) in this context would be akin to using a sledgehammer to carve with.

Quality tags can frequently enhance the resulting image, yet they possess the potential to confine you within a restrictive artistic enclosure.

The same goes for Basic-NegativePrompts.



Textual inversions like badhandv4 can also narrow down your options and limit your creativity. It may fix some hands but if you want to stay true to your base image or to a certain style/look, then using Textual Inversions may not be a good option.


If you want to carve fine details, you have to chisel off one piece at a time.



Pos: 1girl (eating a sandwich:1.3), table, restaurant, portrait, (looking at viewer)
Neg: N/A

To get a usable result, I set the original image as a reference inside ControlNet whilst changing the prompt in txt2img, keeping the style but still changing the image.



Now I'll inpaint a few times to fix any major mistakes.



Then I'll put this through img2img at a higher resolution using a low Denoising strength.



Now we can run this through inpaint a few more times to fix any more mistakes.

I won't be going too in-depth with the inpainting stages, as the process is explained above.


When inpainting the face, I used a low Denoising strength and a reference of the original image inside of ControlNet to try and stay true to the original face whilst trying to refine it.


We will be trying to avoid changing the style of the image as much as possible as we go on.



Inpainting images that start off with only a few prompts can be difficult, especially if you intend to keep the original look and style. (This is more noticeable in stylized models.)


This is where I'll conclude. While you could invest more time inpainting the table and hands, what's presented here should suffice to convey the intended message.


Navigating prompts can pose challenges, as they might effectively restrict you from certain images. However, by sculpting your image and employing selective prompts, you could attain results otherwise unattainable.


The final image was not processed through img2img.



For this particular image, I only needed a few prompts; some images may require many, many prompts to get them looking how you envisioned. Just remember: if it ain't broke, don't fix it.


Basic Settings Explained -

CFG: CFG, or classifier-free guidance, controls how closely the image will follow your prompt. (See the sketch after this list.)

Sampling Steps: Sampling steps dictate the number of denoising iterations that transform noise into a recognizable image from the text prompt; more steps can enhance detail but extend processing time.

Sampling Method: The sampling method determines how noise is turned into a latent image. More information here: Stable Diffusion Samplers: A Comprehensive Guide - Stable Diffusion Art

Seed: The seed is an initial configuration or value that influences the randomness and reproducibility of the model's behavior during generation.

Denoising Strength: Denoising strength governs how much noise is added to the input image before it is re-denoised: 0 adds none (the image comes back unchanged), while 1 adds maximum noise, producing a completely random starting tensor.
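
(To make the CFG definition concrete: at every sampling step the model produces two noise predictions, one with your prompt and one without, and CFG scales the difference between them. A toy illustration, with NumPy arrays standing in for the real tensors.)

```python
import numpy as np

def cfg_combine(noise_uncond, noise_cond, scale):
    """Classifier-free guidance: push the unconditional prediction
    toward the prompt-conditioned one by `scale` (the CFG value)."""
    return noise_uncond + scale * (noise_cond - noise_uncond)

# Toy stand-ins for the model's two predictions at one step.
uncond = np.zeros((4, 64, 64))  # prediction with an empty prompt
cond = np.ones((4, 64, 64))     # prediction with your prompt

print(cfg_combine(uncond, cond, scale=7.0).mean())  # 7.0
# scale=1 reproduces the conditioned prediction; higher values follow
# the prompt more aggressively, at the cost of flexibility.
```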


ControlNet -

What is ControlNet? ControlNet is a neural network framework that enhances diffusion models by adding additional conditions, improving Stable Diffusion for conditional text-to-image generation with inputs like scribbles, edge maps, and more.

In simpler terms, ControlNet enables you to include additional conditions while generating your image.
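
(This guide uses the A1111 ControlNet extension's GUI, but if seeing the idea in code helps, here's a rough equivalent using Hugging Face's diffusers library; a different tool than the one used here, with the model IDs and Canny thresholds chosen by me for illustration.)

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Preprocess the input image into a Canny edge map (the extra "condition").
image = np.array(Image.open("input.png").convert("RGB"))
edges = cv2.Canny(image, 100, 200)
edges = np.concatenate([edges[:, :, None]] * 3, axis=2)  # 1 channel -> RGB
control_image = Image.fromarray(edges)

# Load a ControlNet trained on Canny edges and attach it to an SD 1.5 pipeline.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The edge map now conditions the generation alongside the text prompt.
result = pipe(
    "(8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed), 1girl",
    image=control_image,
    num_inference_steps=35,
).images[0]
result.save("controlnet_render.png")
```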


Getting Started

This is ControlNet.


Here we have our models.


You'll only need to know a few. Here's a list of the models you'll probably be using the most.


  • Canny

  • Depth

  • OpenPose

  • Scribble

  • Reference

  • SoftEdge


Now, drag your image onto the ControlNet canvas.


Assuming you have ControlNet models, select the one you want.


If you don't have any models, please refer to the section titled 'ControlNet Installation Guide'.


https://imgur.com/a/GLZgTRU - GIF -


Okay, now that you've got your input image processed, you can go about generating your image like normal. We won't need to worry about the other settings for now.


If your preprocessed image is cropped after you generate, set your Resize Mode to "Resize and Fill".


Model: revAnimated_v122
Steps: 35
Seed: 3270086737
CFG: 7
Width: 384
Height: 512
Sampling Method: DPM++ 2M Karras
Positive: (8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed), 1girl
Negative: easynegative, badhandv4, (worst quality, low quality, normal quality), bad-artist, blurry, ugly, ((bad anatomy)),((bad hands)),((bad proportions)),((duplicate limbs)),((fused limbs)),((interlocking fingers)),((poorly drawn face))

easynegative and badhandv4 are Textual Inversions; you cannot replicate this without them.




Each ControlNet model will do different things. It is best if you tinker with them yourself.


OpenPose - Superseded by DW Pose

This section covers how to convert an image into an armature; I will not be touching on how to create poses from scratch here.


First, drag and drop your image into ControlNet with the pose you want.

Then select the OpenPose model.


Then process the image.

Now if we generate, we get this:


We're getting there. Now, if we use the workflow from the start of the article, we get this:

This is much better; if you spend more time inpainting, you can get even better results.


ControlNet Settings:


Preprocessor: Many models offer various preprocessors. For optimal results, experiment with each model to determine which one yields the best outcome.

LowVRAM: When set to true, optimizes VRAM usage while using ControlNet.

PixelPerfect: Pixel Perfect aligns the annotator for accurate input/output matching, avoiding displacement.

Allow Preview: Shows the preprocessed image.

Preview as Input: I could not find information on this setting.

Preprocessor Resolution: This determines the resolution at which the preprocessor runs, i.e. the resolution of the preprocessed control image.

Control Weight: This determines how heavily your image will be affected by ControlNet.

Starting Control Step: Scaling from 0 to 1, this dictates the start point of ControlNet.

Ending Control Step: Scaling from 0 to 1, this dictates the end point of ControlNet.

Model: Allows you to choose your model.

Control Mode: This decides the balance between your prompt and ControlNet (Balanced, My prompt is more important, ControlNet is more important).

Resize Mode: This determines how the control image is fitted to your output resolution and aspect ratio (Just Resize, Crop and Resize, Resize and Fill).

(Other settings vary with model. For how these options map onto the extension's API, see the sketch below.)
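
(For completeness: when driving the WebUI over its API, the ControlNet extension accepts these same settings as a 'unit' object inside alwayson_scripts. Every field name below is from memory of one extension version and should be treated as an assumption to verify against your install's /docs.)

```python
# Hypothetical ControlNet unit for the sd-webui-controlnet extension's API.
# Field names vary between extension versions -- verify before relying on them.
controlnet_unit = {
    "input_image": "<base64-encoded input image>",
    "module": "canny",                   # Preprocessor
    "model": "control_v11p_sd15_canny",  # Model (name must match your install)
    "weight": 1.0,                       # Control Weight
    "guidance_start": 0.0,               # Starting Control Step
    "guidance_end": 1.0,                 # Ending Control Step
    "processor_res": 512,                # Preprocessor Resolution
    "control_mode": "Balanced",          # Control Mode
    "resize_mode": "Resize and Fill",    # Resize Mode
    "lowvram": False,                    # LowVRAM
    "pixel_perfect": False,              # PixelPerfect
}

# Attached to a normal txt2img payload:
payload = {
    "prompt": "(8k uhd, masterpiece, best quality, high quality, absurdres, ultra-detailed), 1girl",
    "steps": 35,
    "alwayson_scripts": {"controlnet": {"args": [controlnet_unit]}},
}
```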



Hands.

Nailing down the intricacies of generating hands can be quite challenging, but I have a solution in mind.


Let's generate an image inside txt2img.

Sampling Steps: 50
Sampling Method: DPM++ 2M Karras
Width: 408
Height: 512
CFG: 7
Seed: 514475269

Positive Prompt: (masterpiece, best quality, high quality, highres, ultra-detailed, ((detailed background))), 1girl, looking at viewer, ((peace sign)), portrait, pink hair, (freckles:0.75), blush, happy, smile, medium hair,

Negative Prompt: badhandv4, easynegative, (worst quality, low quality, normal quality), bad-artist, blurry, ugly, ((bad anatomy)),((bad hands)),((bad proportions)),((duplicate limbs)),((fused limbs)),((interlocking fingers)),((poorly drawn face))


Here's our base image. As you can see, this hand is not that accurate.

(Follow along using Photopea)


Here's how we'll start. Obtain a reference image of a hand in a similar pose, either from the internet or by taking a photo of your own hand for greater accuracy. You may also consider using 3D models.

I'll be using a photo of my own hand, with the background removed.

Now place this over your character's hand.


And after a bit of color grading and resizing we get this.

(Make sure the original hand is hidden behind your replacement.)


Now let's place this into img2img and get it to match a bit better.



I used a low denoising strength of 0.25 so that SD would just slightly change the hand to match the rest of the image.


Then you can alter it slightly in inpaint, and upscale again inside img2img with a higher denoising strength.

denoising strength: 0.5


When I found a seed that was good but just not quite there, I would turn on the 'Extra' seed options and set the Variation strength to 0.1.


It isn't a perfect solution, but it gets pretty close in most situations. I used this same method in my newest post here.


https://civitai.com/posts/611100 (NSFW)



Another example of this method:



It may take a couple of tries to get the hand looking how you want. For this demo I generated roughly 100 images, but it definitely pays off in the end.


This article will NOT be receiving any more updates for a while.

(Due to the loss of my main PC, I am currently unable to use Automatic1111.)


Happy Generating!
