Another way to go about prompting, with a prompt group. Which works best? You decide.

I was introduced to the Embedding Merge extension for Auto1111 a while ago now. After reading through the documentation, I used what it describes to create a 'Prompt Group'. This closely follows the guide discussed in the extension, but the extension does not work across all versions of Stable Diffusion, 1.4 to XL, just SD1.5. This is why I created a method based on it that works across as many versions, models and seeds as possible. It is still not complete.

The issue is that the Embedding Merge extension itself only works for SD1.5, at least whenever I have tried to use it. While the extension is installed, it usually causes errors for anything other than SD1.5. Yet if you alter the syntax, it works out fairly well across multiple models and versions. If you get the syntax wrong or mixed up, you can and will get gobbledygook, or an error when running the generation, depending on how complex the prompt is, because the format is close to calling a 'Network' such as a LoRA or a 'Hypernetwork'. Still, I have found this 'prompt grouping' works fairly well across all the Stable Diffusion versions, though it will vary model to model and seed to seed, as some work better than others.

Here is how it works.

Usually when writing a prompt, whether it is a booru prompt, a sentence, a fractured sentence or any combination of prompting styles, you will have groupable or associated words. Common terms include, for example:

masterpiece, best quality, hi-res, UHD and 4K.

These words are associated with 'Quality', and are often added to give a 'better' or 'higher quality' result. A common use of this prompting method looks like:

(masterpiece:1.2), best quality, high quality, 4k, UHD, a photo of a cat.

Using what I learned and inferred from the discussion and paper for Embedding Merge, the same prompt can be rewritten as a group. The effect may be slight or large, depending on the model and the seed. For SD models, this prompting format instead looks like:

a photo of a cat.


a photo of a cat.
AND <('Quality':('Masterpiece'+'Best'+'High'+'4k'+'UHD'):1)>

These prompts produce a 'similar result'; it is not exact. The 'AND' can provide a token reduction while giving different, and sometimes better, results, though that is just my opinion. Sometimes it makes things worse or adds things; other times it makes the images clearer and improves the quality.

The token count goes from 20 to 33, and then to 27 with the 'AND'. Also, comparing the first and second images of the cat, my guess is that the first image is influenced by a lynx, because of the sharp black fur tips on the ears instead of the rounded ear tips of domestic cats.

However, you can further specify aspects of the prompt if there are any specifics you would like to highlight, e.g.:

a photo of a cat.
AND <('Quality':('Masterpiece'+'Best'+'High'+'4k'+'UHD'):1.0)>
BREAK <('cat':(('black and white fur')+'licking nose','tongue on nose'):1.0)>

In this version of the prompt, the keyword 'cat' should be boosted. The BREAK, while not necessary, helps to refer back to the initial usage of the word. If you do not do this, you may get 2 or more cats. You still might, depending on the seed and the rest of your prompt.

There are many reasons you may use this kind of specification. The main reason is that you're after a specific trait, e.g. the black and white fur for the cat. Although the tongue makes this rather funny.

- Using < and > can mistakenly create 'unknown networks', as you are mimicking networks like a LoRA in your prompt. A key indicator of a mistake in the prompt is a token count showing -1/-1, meaning one of your prompt groups is too close to a network.

- Also make sure you keep track of every opening '(' and closing ')'. In a complex prompt, if these are out of whack, the image becomes very abstract or nothing is produced.
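
The bracket tracking in the two points above can be roughly automated before you paste a prompt into the UI. This is a minimal sketch of my own, not part of Auto1111 or the Embedding Merge extension; `check_brackets` is a made-up helper name, and it only checks that brackets pair up, not that the prompt will render well.

```python
# Map each closing bracket to the opener it must match.
PAIRS = {")": "(", "]": "[", ">": "<", "}": "{"}
OPENERS = set(PAIRS.values())


def check_brackets(prompt):
    """Return a list of problems; an empty list means every bracket pairs up."""
    problems = []
    stack = []  # (bracket, position) for openers still waiting to be closed
    for pos, ch in enumerate(prompt):
        if ch in OPENERS:
            stack.append((ch, pos))
        elif ch in PAIRS:
            if stack and stack[-1][0] == PAIRS[ch]:
                stack.pop()
            else:
                problems.append(f"unexpected '{ch}' at position {pos}")
    for ch, pos in stack:
        problems.append(f"unclosed '{ch}' opened at position {pos}")
    return problems


good = "<('cat':(('black and white fur')+'licking nose','tongue on nose'):1.0)>"
bad = "<('cat':('black fur'):1.0>"  # the outer ')' is missing

print(check_brackets(good))
print(check_brackets(bad))
```

The first prompt should come back with no problems, while the second reports the stray '>' and the openers that were never closed.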

One method to keep the brackets in check is to add the ( ) or [ ] first, meaning you start with the empty shell and fill it in:

1. <>
2. <()>
3. <('':():1.0)>
4. <('keyword':(`add your specifics here`):1*effect value)>

Use ( ) or [ ], depending on how impactful you want the 'prompt group' to be. However, if you use { }, this works with SD1.5 and the Embedding Merge extension but can cause issues elsewhere. You can also fine-tune this like any LoRA, by increasing or decreasing the :1.0)> value.
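
The outside-in steps above can also be sketched as a small helper, so the brackets are always written as a matched shell and only the keyword, specifics and weight change. The function name `prompt_group` and its parameters are my own invention for illustration; it only builds the text, and it handles flat lists of specifics rather than nested sub-groups.

```python
def prompt_group(keyword, specifics, weight=1.0, joiner="+"):
    """Build <('keyword':('a'+'b'):weight)> with matched brackets.

    keyword   -- the word the group is attached to, e.g. 'Quality'
    specifics -- flat list of terms to put inside the group
    weight    -- the effect value that ends up before the closing )>
    joiner    -- '+' or ',' between the quoted specifics
    """
    inner = joiner.join(f"'{term}'" for term in specifics)
    return f"<('{keyword}':({inner}):{weight})>"


quality = prompt_group("Quality", ["Masterpiece", "Best", "High", "4k", "UHD"])
print("a photo of a cat.")
print("AND " + quality)
# The second line printed is:
# AND <('Quality':('Masterpiece'+'Best'+'High'+'4k'+'UHD'):1.0)>
```

Passing `joiner=","` gives the comma version of the same group, and changing `weight` is the same as editing the :1.0)> value by hand.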

Other alterations:

You can swap the + for commas ',', although there are differences between the two. I'd suggest you stay consistent with whichever syntax version you use for your prompt groups within the same prompt. Furthermore, this works rather well with Dynamic Prompting and wildcards; just remember {key1|key2} if you are after randomized keywords.
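
To show what {key1|key2} does to a prompt group before it reaches the model, here is a simplified sketch of the idea. It is my own stand-in, not the Dynamic Prompts extension's code: `resolve_variants` is a hypothetical name, and it only handles the plain {a|b} form, with none of the extension's extra options.

```python
import random
import re

# Matches the innermost {...} that contains no other braces.
VARIANT = re.compile(r"\{([^{}]*)\}")


def resolve_variants(prompt, rng):
    """Replace every {a|b|c} with one randomly chosen option."""
    while True:
        match = VARIANT.search(prompt)
        if match is None:
            return prompt
        choice = rng.choice(match.group(1).split("|"))
        prompt = prompt[:match.start()] + choice + prompt[match.end():]


rng = random.Random(1234)  # fixed seed so the pick is repeatable
template = "<('cat':('{black|white|ginger} fur'+'licking nose'):1.0)>"
print(resolve_variants(template, rng))
```

Each run with a different seed picks one of the three fur colours, so the group that is actually sent is shorter than the template you typed.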

The results may also depend on the seed. But if you make the group overly complex, exceed a certain token limit within the < and >, or make a syntax mistake, you'll end up with a completely abstract result or a blurry mess.

From my own testing, I suggest keeping the token count of an individual <'key'> below 75; 25 to 30 does seem to be a sweet spot. However, if you have multiple groups, <'keys#1'> <'keys#2'> <'keys#3'> etc., there can and will be other weird interactions that lead to abstract, blurry or ugly results. A 'BREAK' or 'AND' can help alleviate these issues, but not always. Also, when using wildcards, your prompt group as written could easily exceed, say, 75 tokens; however, once the dynamic prompt is run it is often less than 75, so it should be fine.
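
If you want a quick way to spot groups that are getting too long, something like the sketch below can flag them. Big caveat: `rough_count` is a crude stand-in that counts words and runs of punctuation, and it is NOT the real tokenizer, so its numbers will differ from the counter shown in the UI. The names `rough_count` and `groups_over_budget` are made up for this example; swap in a real token counter through the `count_tokens` argument if you have one.

```python
import re

# Finds each <...> group, assuming groups are not nested inside each other.
GROUP = re.compile(r"<[^<>]*>")


def rough_count(text):
    """Very rough size estimate: words plus runs of punctuation."""
    return len(re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]+", text))


def groups_over_budget(prompt, limit=75, count_tokens=rough_count):
    """Return (group, count) for every <...> group whose count exceeds limit."""
    flagged = []
    for group in GROUP.findall(prompt):
        count = count_tokens(group)
        if count > limit:
            flagged.append((group, count))
    return flagged


prompt = "a photo of a cat. AND <('Quality':('Masterpiece'+'Best'):1)>"
print(groups_over_budget(prompt))           # default limit of 75
print(groups_over_budget(prompt, limit=5))  # tiny limit, just to show a hit
```

With the default limit nothing is flagged for this short group; the tiny limit is only there to show what a flagged group looks like.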

Happy Prompting.

Bonus: Re-written 2D style prompt:

<('2D':((('hand drawn')+('illustration')+'drawing'+['sketch']+'line art'),(*anything in particular ie artist, etc...*))):1>

<('2D':((('hand drawn')+('illustration')+'drawing'+['sketch']+'line art'),('Artists':('J. Scott Campbell'+'Jim Lee')+'comic'+'Hansel and Gretel'))):1>

Negative prompt:
<(Resolution:(('low'+'poor'+'blurry'+'hazy')+('JPEG artefacts')+(interlaced)):1.2)>,