Crafting a Hires Tuned Prompt from Scratch

In this article I'm going to address the troll under the bridge: everyone knows he's there, but few consider all the implications.

The importance of matching ALL settings when prompting.

This may not come as a shock to the seasoned pros in the community. But I hope it will challenge some of you to rethink how you prompt in the future, and to look at your saved prompts in light of the settings they were built with: the scheduler, the model, the number of steps, whether it was a hires or a lowres prompt, and other settings like CFG scale, denoising strength, etc.

For this test, we will be using the simple_kes (version 1.3) scheduler, which is not yet released on my GitHub. This scheduler is unique in that it can combine several different schedulers and, using additional settings, fine-tune the results. More on that in coming articles.

The focus of this article is to establish a "base prompt" which you should start from when adding positive prompts to it.

Before we dive into the particulars of building a prompt from scratch for hires prompting, let's look at a quick comparison: a prompt built for one scheduler, and what happens to the image when we swap only the scheduler. The results should reinforce the importance of building your prompt with the same settings throughout, and show that a prompt built with one scheduler DOES behave differently under another.

More details on the effects of schedulers while prompting

In a previous test, I built a turtle prompt from start to finish with the Karras scheduler. That turtle uses a different base prompt than the one we'll be designing today, because it is a different scheduler and it navigates the Stable Diffusion space differently.

turtle,

Negative prompt: pencil, clothing:::unzipped, sunglasses!!, framing:::half-body!!, pose:::backward, covered_body!!, {environment:::road, !!}, hand_details:::polished_nails, painted_nails!!,

Steps: 25, Sampler: DPM++ 2M, Schedule type: Karras, CFG scale: 8, Seed: 88120186151, Size: 640x720, Model hash: 86758142da, Model: FusionX-Realistic_v3_float16, Denoising strength: 0.6, Clip skip: 2, Hypertile U-Net: True, Hires upscale: 2, Hires steps: 35, Hires upscaler: R-ESRGAN 4x+ Anime6B, Version: v1.10.1, Hashes: {"model": "86758142da"}

turtle,

Negative prompt: pencil, clothing:::unzipped, sunglasses!!, framing:::half-body!!, pose:::backward, covered_body!!, {environment:::road, !!}, hand_details:::polished_nails, painted_nails!!,

upscale: 2, Hires steps: 35, Hires upscaler: R-ESRGAN 4x+ Anime6B, Version: v1.10.1, Hashes: {"model": "86758142da"}

The prompts are the same; the scheduler is different. This prompt was built specifically for the Karras scheduler, and with the same seed, switching to a different scheduler greatly changed the final image.

Once you realize you're getting different results with the same prompt but a different scheduler, you should re-evaluate. I came to the realization that I needed to design a new, stable base prompt for the new scheduler. It's the same process, and we start from a 0-0 positive/negative prompt seed.
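Before reusing a saved prompt, it can help to check which settings actually differ between two generations. Here is a minimal Python sketch that splits two parameter lines into key/value pairs and reports the differences. The function names parse_params and diff_params are made up for this example, and the sketch assumes that no value contains ", " inside it. The two lines are the Karras parameters shown above and the "Hires Start" parameters shown later in this article.

```python
def parse_params(line):
    """Split a 'Key: value, Key: value' parameter line into a dict.

    Assumption: values never contain ', ' themselves.
    """
    settings = {}
    for field in line.split(", "):
        key, sep, value = field.partition(": ")
        if sep:  # ignore fragments that are not 'Key: value'
            settings[key.strip()] = value.strip()
    return settings


def diff_params(line_a, line_b):
    """Return {key: (value_in_a, value_in_b)} for every setting that differs."""
    a, b = parse_params(line_a), parse_params(line_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


karras_line = (
    "Steps: 25, Sampler: DPM++ 2M, Schedule type: Karras, CFG scale: 8, "
    "Seed: 88120186151, Size: 640x720, Model hash: 86758142da, "
    "Model: FusionX-Realistic_v3_float16, Denoising strength: 0.6, "
    "Clip skip: 2, Hypertile U-Net: True, Hires upscale: 2, Hires steps: 35, "
    "Hires upscaler: R-ESRGAN 4x+ Anime6B, Version: v1.10.1"
)
hires_start_line = (
    "Steps: 25, Sampler: DPM++ 2M, Schedule type: Karras Exponential, "
    "CFG scale: 8, Seed: 88120186151, Size: 640x720, Model hash: 86758142da, "
    "Model: FusionX-Realistic_v3_float16, Denoising strength: 0.6, "
    "Clip skip: 2, Hypertile U-Net: True, Hires upscale: 2, Hires steps: 35, "
    "Hires upscaler: R-ESRGAN 4x+ Anime6B, Version: v1.10.1"
)

differences = diff_params(karras_line, hires_start_line)
print(differences)  # {'Schedule type': ('Karras', 'Karras Exponential')}
```

Everything matches except the schedule type, which is exactly the situation where a saved prompt should be re-tuned rather than trusted.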

The Plan...

Our goal is to create hires images with this prompt. So we'll wait for each hires image to finish, pick one word to add to our negative prompt, wait for the next image to finish, and then pick another word.

If you're tempted to design a prompt using only the low-res images, keep in mind that it won't translate well when you go to a hires image.

This is based on the simple but understated fact: one prompt can change it all.

To highlight this difference, I'm going to show you my unprompted start at low res and at hires. In this test, we start at 640x720 and hires to double that, 1280x1440.

Start:

The low res image is a little blurry, especially in the lower right quadrant. The fingers I don't like either. Surprisingly, this seed produced a pretty good image even without negative or positive prompts.

If our goal were to create a good lores prompt, we would start negative prompting from here. However, we want to create a good, polished hires image. If we keep in mind the core truth, "one prompt can change it all", then we know that prompting for lores and switching to hires later will give us different results, and we may need to add more hires prompting afterwards.

Hires Start:

Steps: 25, Sampler: DPM++ 2M, Schedule type: Karras Exponential, CFG scale: 8, Seed: 88120186151, Size: 640x720, Model hash: 86758142da, Model: FusionX-Realistic_v3_float16, Denoising strength: 0.6, Clip skip: 2, Hypertile U-Net: True, Hires upscale: 2, Hires steps: 35, Hires upscaler: R-ESRGAN 4x+ Anime6B, Version: v1.10.1, Hashes: {"model": "86758142da"}

What about this image screams loudest at you?

I wasn't sure how to put it into words. To me, the worst part of this image is the weird, wonky fingers, and you may be right to try a negative prompt that addresses the fingers directly. But that gets into tricky waters: if you start negative prompting for specific things like fingers (6-fingers, 7-fingers, etc.), you will need to offset it later. Some people do use that as a prompting strategy.

For me, what stood out was the off, faded look of the whole image, so I decided to target that and called it "ghost".

Let's see the next image at hires with ghost added.

Negative prompt: ghost,

What screams loudest at you here? Or...what don't you want to see in the next image?

For me it was the arms at the bottom that seem to be covering the image, so I tried: "covered_body"

Let's see the result!

Negative prompt: ghost, covered_body,

Ok... in a previous test, I had used this prompt and it didn't change the image when I put it in this order with this hand_details header, so I went ahead and added the complex prompt:

"hand_details:::polished_nails, painted_nails!!,"

If you're unfamiliar with the prompt above, it's because I'm using a prompt_parser that I designed. It supports words joined with underscores and parent-child sequences, where ":::" follows the parent and "!!" ends the sequence. The prompt_parser strips out the ":::" and the "!!" and treats the words inside the sequence as children of the parent. In this case, "polished_nails, painted_nails" are related to "hand_details", which is itself a combined two-word prompt. It's a little complicated, and if you want to learn more, see this article which explains a little more and shows you where to get it.
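To make the syntax concrete, here is a minimal Python sketch of how a "parent:::child, child!!" sequence could be pulled out of a prompt. This is not the actual prompt_parser code, only an illustration of the grouping described above. The names SEQUENCE, split_terms and parse_prompt are hypothetical, and the sketch does not handle the curly-brace form seen in the earlier Karras negative prompt.

```python
import re

# Hypothetical pattern for 'parent:::child, child!!' (illustration only).
# \w+ also matches underscores, so 'hand_details' stays one word.
SEQUENCE = re.compile(r"(\w+):::(.*?)!!")


def split_terms(text):
    """Split comma-separated text into non-empty, stripped terms."""
    return [term.strip() for term in text.split(",") if term.strip()]


def parse_prompt(prompt):
    """Return plain terms as strings and sequences as (parent, [children])."""
    parsed = []
    pos = 0
    for match in SEQUENCE.finditer(prompt):
        # Plain terms that appear before this sequence.
        parsed.extend(split_terms(prompt[pos:match.start()]))
        # The sequence itself, with ':::' and '!!' stripped out.
        parsed.append((match.group(1), split_terms(match.group(2))))
        pos = match.end()
    # Plain terms after the last sequence.
    parsed.extend(split_terms(prompt[pos:]))
    return parsed


negative = "ghost, covered_body, hand_details:::polished_nails, painted_nails!!,"
for item in parse_prompt(negative):
    print(item)
# ghost
# covered_body
# ('hand_details', ['polished_nails', 'painted_nails'])
```

The point of the grouping is visible in the output: the comma between "polished_nails" and "painted_nails" no longer separates two independent terms, because both belong to "hand_details".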

Moving on.....this is the result:

Negative prompt: ghost, covered_body, hand_details:::polished_nails, painted_nails!!,

Here we decided to group things together. Much like we did with hand_details, we're also going to put covered_body into a new parent-child grouping: "pose:::covered_body!!,"

Even without a positive prompt to shape where we're going, we're going to continue this trend to see if we can improve upon it.

Now, it may be difficult to see if this is a good ending point for your prompt. And it could be. But I decided to address the blurry background to see how adding that one word to the negative prompt might change the next image.

The next result:

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry,

From here, I'm going to get particular and see if I can address the object being held. I don't know if it's a phone, a tablet, a book - it's unclear. I decided to treat it as a phone and used "iphone" as the negative prompt:

The last result before adding a positive prompt:

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone,

What you'll notice is that adding just one word, in this case "iphone", did not completely change the image. It just removed the object being held. When you get to the point where a new negative prompt changes a particular object rather than the entire picture, you're ready to go to the next phase.

We have successfully navigated through a host of different images where adding a single word or prompt conjunction changed the whole image. With the last image, it changed only one part and kept the overall composition the same.

At least with this seed, without trying to change the background or subject (woman) further, I decided to stop here and use this as the "base prompt".

So what do we want to draw?

Let's see how it performs when we add "turtle":

turtle,

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone,

Not bad!!

Not the best -- but not bad for the first prompt without negatively prompting the turtle bits that we don't like.

From here, with the turtle-free base prompt built, we'll look at each new image, pick the one thing that strikes you as something you DON'T want, and add it to the negative prompt, without adding any other positive prompts.

Our goal is to take the positive prompt and build a second layer of negatives for it, on top of the base prompt we designed earlier.

So now we'll just add them one at a time, and I'll let the images speak for themselves, followed by a summary.

turtle,

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone, extra_legs,

turtle,

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone, extra_legs, underwater,

turtle,

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone, extra_legs, underwater, passenger,

Quick pause here. Adding the word "passenger" had the unintended result of moving the object on top of the shell into a shared turtle. So I rolled back that change for the next prompt.
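The add-one-word, check, and roll-back routine is easy to keep track of with a small helper. Here is a minimal Python sketch; the class name NegativePromptHistory and its methods are invented for this example and are not part of any tool used in this article. Each grouped sequence is stored as a single entry so that the comma inside it is not treated as a separator.

```python
class NegativePromptHistory:
    """Track negative prompt terms added one at a time, with rollback."""

    def __init__(self, base_terms):
        self.terms = list(base_terms)
        self.rejected = []  # terms that were tried and rolled back

    def add(self, term):
        """Append one term and return the prompt to use for the next image."""
        self.terms.append(term)
        return self.render()

    def rollback(self):
        """Remove the most recent term and remember that it was rejected."""
        self.rejected.append(self.terms.pop())
        return self.render()

    def render(self):
        """Join the terms in the same 'a, b, c,' style used in this article."""
        return "".join(f"{term}, " for term in self.terms).rstrip()


history = NegativePromptHistory([
    "ghost",
    "pose:::covered_body!!",
    "hand_details:::polished_nails, painted_nails!!",
    "blurry",
    "iphone",
])
history.add("extra_legs")
history.add("underwater")
history.add("passenger")   # changed the image in an unwanted way
history.rollback()         # so it comes back out
print(history.add("extra_turtle"))
print(history.rejected)    # ['passenger']
```

Keeping the rejected words around is a design choice: it saves you from trying the same word again later with the same settings.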

turtle,

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone, extra_legs, underwater, extra_turtle,

turtle,

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone, extra_legs, underwater, extra_turtle, object_ontop_shell,

turtle,

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone, extra_legs, underwater, extra_turtle, object_ontop_shell, extra_tail,

And we're almost there. One thing I don't like is the claws: they look almost crab-like, or a little like flippers.

turtle,

Negative prompt: ghost, pose:::covered_body!!, hand_details:::polished_nails, painted_nails!!, blurry, iphone, extra_legs, underwater, extra_turtle, object_ontop_shell, extra_tail, flipper,

And this is the final result where I stopped after using negative prompting. I feel that this is a pretty good looking turtle.

From here you could add more positive prompts, then tune each new word with new negatives.

This turtle is very different from the Karras turtle. Using the Karras scheduler with these prompts will not produce the same image, and more than likely it will produce a bad image of a turtle. Ready to check and confirm again?

We should get a similar looking turtle, but how similar?

A two-legged turtle and a cute little turtle that looks like a cross between a turtle and a crab, with a weird coiled tail at the back leg. Nope, not a good result. Granted, the realism of the image is pretty good, but the image itself not so much.

It's so important to design your prompts using the same settings. Even the sampler or the scheduler plays a role in the prompt. Going forward, I hope you'll keep this in mind when using other prompts you see online, or in your own prompt crafting!
