Tree-Structured Prompting - A Contrarian Prompting Technique That Works Wonders!

The very powerful, novel prompting technique I'm about to teach you is not my own invention. It was taught to me by my adult son, who is a much better prompter than I am! The advantage of this technique over your standard run-of-the-mill prompting process is that it affords much more fine-grained control over the picture generation, while almost magically fixing any technical problems in the picture with little to no effort. So let's start!

Step 1: Become An Empty Vessel

First you must unlearn what you have learned! You can't fill a cup that is already full, and what you think you know about prompting will not be of any help in the following. Just let it all go, and make yourself mentally ready to start over from square one as if you never prompted a thing before. But rest assured, the learning process is very fast, and you will master the new technique in just a couple of minutes! It's super simple!

Step 2: Prerequisites

This prompting method works wonders on any model, but it works best on a model that is high quality in the sense that you don't need a lot of crap in the negative to get decent pictures out of it. I'm going to use my own photorealistic model Contra Base 1.0 in this example.

Step 3: Decide what you want in your picture

This step is crucial, because we are building a tree-structured prompt, not a linear prompt like you are used to, and the "root" of the prompt tree should be an overall summary of the general idea of the picture. You can't easily modify the root prompt later, so it pays to get it right before you start adding branches to it.

For this quick example, I'm going to make a picture of a girl leaning her back against a birch tree overgrown with moss.

Step 4: Lock a random seed and start prompting!

We will write a basic description of the scene we want in the positive prompt, and then we will copy the first word of that prompt into the negative prompt, like this:

Positive: "lovely Anna leaning against a birch tree"

Negative: "lovely"

I will explain exactly how and why this seemingly crazy idea works later. For now, just assume there's method to the madness and read on!

The first image I got isn't great, but it doesn't have to be. The entire point of this prompting method is to gradually nudge our way to exactly where we want to be, by adding more information with laser-guided precision. So this image is plenty good enough, and I don't need to look for another seed.

Now we will add branches to our root prompt by redefining (or clarifying) its terms! The first term I wish to redefine is "Anna". Who is she? What did I mean when I wrote "Anna"? I decide to define "Anna" as being a "22yo girlfriend". To convey this redefinition to Stable Diffusion, I add the redefinition prompt branch with the help of the BREAK statement (in A1111). Once again I will put the word I wish to redefine in both the positive and negative prompts, for reasons that will be made clear in good time.

This is how the prompts look now:

Positive: lovely Anna leaning against a birch tree BREAK Anna 22yo girlfriend

Negative: lovely BREAK Anna

And this is what happened to the picture. It got way worse, but there's no reason to panic, we'll have it fixed right up in a couple of minutes:

It seems Anna got way too girlfriend-ly with that tree, so next I'll clarify what I meant by "girlfriend". I'll keep it simple and to the point: by "girlfriend" I mean "beautiful young woman with a tree". This is how our prompts look after adding this sub-branch to our first branch:

Positive: lovely Anna leaning against a birch tree BREAK Anna 22yo girlfriend BREAK girlfriend beautiful young woman with a tree

Negative: lovely BREAK Anna BREAK girlfriend

As you can see, you can add a redefining/clarifying branch to whatever term you feel needs to be clarified, not only to terms in the root prompt. This is what gives the method its laser-focused precision, because the branch you add affects the redefined term much more strongly than anything else.

With the clarification of "girlfriend", we get this picture:

Hey, she's leaning against the tree now! But I want to make her lean in a more casual way, so I'll clarify what I mean by "leaning against". I'll explain to Stable Diffusion that "leaning against" means "standing in a relaxed resting position". You know the drill by now, and these are the resulting prompts:

Positive: lovely Anna leaning against a birch tree BREAK Anna 22yo girlfriend BREAK girlfriend beautiful young woman with a tree BREAK leaning against standing in a relaxed resting position

Negative: lovely BREAK Anna BREAK girlfriend BREAK leaning against

And so we get this result:

Now her left arm is doing something weird, but no worries, that will sort itself out soon enough. Let's focus on the moss instead! I stated in the beginning that I wanted a birch tree overgrown with moss, so I'll target the tree and clarify that "birch tree" should actually have "course ancient moss grown bark".

Positive: lovely Anna leaning against a birch tree BREAK Anna 22yo girlfriend BREAK girlfriend beautiful young woman with a tree BREAK leaning against standing in a relaxed resting position BREAK birch tree course ancient moss grown bark

Negative: lovely BREAK Anna BREAK girlfriend BREAK leaning against BREAK birch tree

And now we get this result:

And with that result I declare success for this quick and dirty example! I got a girl leaning her back against a birch tree overgrown with moss, just as promised!
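
The prompt-building pattern used in this walkthrough is mechanical enough to sketch in a few lines of Python. This is only an illustration, not a tool I actually used: the function name `build_tree_prompts` and its arguments are made up for this example, and it assumes a UI such as A1111 that understands the BREAK keyword. The branch texts are copied exactly as I typed them above.

```python
def build_tree_prompts(root, branches):
    """Return (positive, negative) prompt strings.

    root     -- the root prompt, a short summary of the whole picture
    branches -- list of (term, definition) pairs, in the order they were added
    """
    positive_chunks = [root]
    # The negative prompt starts with the first word of the root prompt.
    negative_chunks = [root.split()[0]]
    for term, definition in branches:
        # Positive side: the term followed by its redefinition.
        positive_chunks.append(f"{term} {definition}")
        # Negative side: only the term being redefined.
        negative_chunks.append(term)
    return " BREAK ".join(positive_chunks), " BREAK ".join(negative_chunks)


positive, negative = build_tree_prompts(
    "lovely Anna leaning against a birch tree",
    [
        ("Anna", "22yo girlfriend"),
        ("girlfriend", "beautiful young woman with a tree"),
        ("leaning against", "standing in a relaxed resting position"),
        ("birch tree", "course ancient moss grown bark"),  # spelled as in the prompt above
    ],
)
print("Positive:", positive)
print("Negative:", negative)
```

Run as written, this should reproduce the final positive and negative prompts from the example. Adding another branch is just one more `(term, definition)` pair at the end of the list.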

A few key observations can be made:

  1. The changes to the picture get more and more subtle the more prompt branches you add

  2. You can target a specific part of the picture with a very specific modification

  3. Her hands and face look good, technical problems "fixed themselves" along the way!

Something important to understand is that this is not a contrived example. I literally picked a random seed, and both the root prompt and all the prompt branches were just me shooting from the hip, with no testing of possible alternatives. I just wrote whatever popped into my mind and kept going, even when things looked like they were going sideways. It took me less than two minutes to create the example pictures for this article (and two hours to write it up!). But in spite of this extremely sloppy attitude, the process still worked!

That's how powerful Tree-Structured Prompting is!

You can of course take it much further. Working with more care and adding more branches can give you very detailed control over your picture. There is a diminishing return when you add more branches, so you should get in the ballpark quickly. But as long as you're not way off the mark to begin with, you can add dozens of branches before running out of steam.

Explanation: Why It Works

You may or may not know that typing the same thing in both the positive and negative prompts is mathematically the same as having only a positive prompt at a CFG scale of 1.0. This means the first word in the positive prompt is radically weakened in strength when the same word is also put in the negative prompt. That's how the normal meaning of the term is (almost) prevented from having an impact.
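
For the curious, the CFG claim can be checked with a toy calculation. The sketch below assumes the usual classifier-free guidance formula, where the sampler starts from the negative prediction and moves `scale` times the distance toward the positive prediction. The three-number lists are made-up stand-ins for the model's real noise predictions, and `cfg_combine` is a name invented for this example. It also only covers the limiting case where the whole prompt is identical on both sides; in the tree prompts above only some words overlap, so the cancellation is partial.

```python
def cfg_combine(eps_pos, eps_neg, scale):
    """Classifier-free guidance on toy vectors:
    result = negative + scale * (positive - negative), element by element."""
    return [n + scale * (p - n) for p, n in zip(eps_pos, eps_neg)]


# Made-up numbers standing in for noise predictions (not real model output).
eps_lovely = [0.25, -0.5, 1.0]       # prediction conditioned on "lovely"
eps_other_neg = [0.0, 0.25, -0.75]   # prediction for some unrelated negative prompt

# Same prompt on both sides: the difference is zero, so the scale drops out.
same_both_sides = cfg_combine(eps_lovely, eps_lovely, scale=7.0)

# Positive prompt only at CFG 1.0: the negative prediction cancels out.
positive_only_cfg_1 = cfg_combine(eps_lovely, eps_other_neg, scale=1.0)

print(same_both_sides)
print(positive_only_cfg_1)
```

Both calls come out equal to the plain "lovely" prediction, which is the equivalence described above: with the same text on both sides, turning the CFG scale up adds no extra push from that text.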

But when you then add more words to the positive prompt only, CLIP reads them in the context of the weakened word, which has a very similar effect to redefining it.

The "tree structure" of the prompt is implicit, created by how terms are redefined.
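
To make that implicit tree visible, here is a small hypothetical helper. The rule it uses is an assumption for illustration only: a branch hangs off the earliest earlier chunk (the root, or a previous branch's definition) whose text contains the redefined term. Stable Diffusion and A1111 do nothing like this internally, and the name `infer_parents` is invented for this sketch.

```python
def infer_parents(root, branches):
    """Map each redefined term to the chunk that introduced it.

    root     -- the root prompt text
    branches -- list of (term, definition) pairs, in the order they were added
    Returns a dict {term: parent}, where parent is "ROOT", the term of an
    earlier branch, or None if no earlier chunk mentions the term.
    """
    parents = {}
    earlier = [("ROOT", root)]  # (name, text) of every chunk seen so far
    for term, definition in branches:
        # Naive substring match, good enough for a short illustration.
        parents[term] = next(
            (name for name, text in earlier if term in text), None
        )
        earlier.append((term, definition))
    return parents


tree = infer_parents(
    "lovely Anna leaning against a birch tree",
    [
        ("Anna", "22yo girlfriend"),
        ("girlfriend", "beautiful young woman with a tree"),
        ("leaning against", "standing in a relaxed resting position"),
        ("birch tree", "course ancient moss grown bark"),
    ],
)
for term, parent in tree.items():
    print(f"{parent} -> {term}")
```

For the example in this article, "Anna", "leaning against" and "birch tree" come out as branches of the root, while "girlfriend" is a sub-branch of "Anna".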

If you choose to try this method out for yourself, you will find that it gives you way more control than you ever had before, and that you don't need to mess around with negative embeddings or typing "bad hands" and stuff like that. You get perfect pictures anyway. You will also notice an improved sense of freedom. It's just a very easy-going way to prompt.

Final Words

Also try out my photorealistic model Contra Base, now updated to 2.0!
