Introduction
Hi there. Here are some notes on how I structure my prompts. This guide comes from personal experience, so nothing here should be taken as ground truth; it's just what works well for me!
It is largely inspired by the guide TianChimp wrote for the Hardcore - Hentai checkpoint. Check my favorite models; they usually work pretty well with my way of prompting.
If you want to have a look at my setup, check my ComfyUI Workflow article. My Scene Composer workflow is also a good example of the prompting structure I propose.
I will update this article from time to time. Feel free to take it, tweak it, and share your thoughts!
Philosophy
I try to follow a few guiding principles when I write my prompts. I recommend reading Danbooru's Howto, which is a good baseline.
Tag what you see, not what you know. Don't tag things that are not visible in the final image. If the character wears jeans but the framing is upper body, don't mention them.
Be explicit. Avoid abstract concepts and prefer visual ones. For example, "writing" is better than "doing homework".
Minimal Tagging Criteria. Too many tags cause noise and lead to less precise results. Keep things to the strict minimum and put down only what you absolutely want to see. For example, why add an "nsfw" tag if your scene is already explicit?
Use common words. You don't have to restrict your tags to Danbooru's list, but I recommend using simple, common and straightforward keywords. You would typically prefer "excited" to "ecstatic". Keep in mind this does not mean preferring less specific words ("shirt") over more specific ones ("blouse") that would still be recognized. (Thanks to NanashiAnon for the clarification.)
Structure
I mainly structure my prompts into 4 categories: Composition, Action, Subject, Environment. They can easily be remembered via the acronym CASE.
I personally like to separate those categories with line breaks; it helps me make adjustments quickly. I recommend having a look at Danbooru tag groups to find what you need.
Keep in mind that it's a guideline. It's ok to mix the order of concepts and tags, especially inside a category.
1. Composition
Quality
Style (photorealistic, color palette, …)
Camera (angle, framings, point of view, …)
Protagonists (1boy, 1girl, animal, object, landscape, …)
2. Action
3. Subject
Body (body type, breast size, skin color, …)
Face related (eye color, hairstyle, …)
Attitude (happy, surprised, serious, determined, …)
Expressions (open eyes, open mouth, frowning, …)
4. Environment
Indoors / Outdoors
Light / Time of day (sunny, dawn, dusk, night)
Weather (wind, rain, snow, …)
Objects (furniture, vehicles, …)
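As a minimal sketch of the CASE structure above, here is one way to keep the four categories separate and render them with one line break per category. The CasePrompt class and its render method are hypothetical names made up for this illustration, not part of ComfyUI or any other tool.

```python
from dataclasses import dataclass, field


@dataclass
class CasePrompt:
    """Hypothetical container: tags of a prompt, grouped by CASE category."""
    composition: list[str] = field(default_factory=list)
    action: list[str] = field(default_factory=list)
    subject: list[str] = field(default_factory=list)
    environment: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join each non-empty category on its own line, in CASE order."""
        groups = [self.composition, self.action, self.subject, self.environment]
        # A trailing comma on each line keeps the tags separated once
        # the lines are read as a single prompt.
        lines = [", ".join(tags) + "," for tags in groups if tags]
        return "\n".join(lines)


prompt = CasePrompt(
    composition=["source_anime", "lineart", "portrait", "from above", "1girl", "solo"],
    action=["writing", "holding pencil"],
    subject=["long hair", "blue eyes", "glasses", "smile"],
    environment=["indoors", "library", "sunset", "desk"],
)
print(prompt.render())
```

Keeping each category in its own list makes it easy to swap the environment or the action without touching the rest, which is the main reason I like the line breaks in the first place.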
Keep in mind that with certain models, some tags lead to unexpected results. Typically with FLUX, tags like boy/girl will output children: you should use man/woman instead.
About prompt length
The prompt is split into words (or chunks of words), which are converted into numerical representations called tokens. Depending on which model is used and how token weights are normalized, certain parts of your prompt will get more or less attention. My rule of thumb: keep your prompt short and avoid useless repetitions. You'll have more control over it.
Here are some resources if you want to read more on the matter:
Tokens with SDXL and SD15 models – Alen Knight
Token normalization & Weight interpretation – BlenderNeko (GitHub)
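To get a feeling for prompt length without loading a real tokenizer, here is a rough sketch. It only imitates the first splitting step of a CLIP-style tokenizer (runs of letters, single digits, runs of punctuation), so the number it returns is a lower bound: the real tokenizer may split uncommon words further. The 75-token chunk size is an assumption based on what is commonly cited for SD1.5 and SDXL text encoders, and both function names are hypothetical.

```python
import re

# Rough imitation of CLIP-style pre-tokenization: a run of letters,
# a single digit, or a run of punctuation each count as one piece.
_PIECE = re.compile(r"[A-Za-z]+|\d|[^\sA-Za-z\d]+")

CHUNK_SIZE = 75  # assumption: usable tokens per chunk for SD1.5 / SDXL


def estimate_tokens(prompt: str) -> int:
    """Lower-bound estimate of the token count of a prompt."""
    return len(_PIECE.findall(prompt))


def find_repeated_tags(prompt: str) -> list[str]:
    """Return tags that appear more than once (case-insensitive)."""
    seen: set[str] = set()
    repeated: list[str] = []
    for raw in re.split(r"[,\n]", prompt):
        tag = raw.strip().lower()
        if not tag:
            continue
        if tag in seen and tag not in repeated:
            repeated.append(tag)
        seen.add(tag)
    return repeated


example = "score_9, 1girl, solo"
print(estimate_tokens(example))                # 8 pieces: score _ 9 , 1 girl , solo
print(estimate_tokens(example) <= CHUNK_SIZE)  # True: fits in one chunk
print(find_repeated_tags("sweat, blush, sweat,\nblush, smile"))  # ['sweat', 'blush']
```

Note how score_9 alone already counts for three pieces and every comma counts for one: quality tags and separators eat into the budget faster than you might expect.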
Example
Here is an example following the structure. I usually add some line breaks to get a better view of the main areas of the prompt.
score_9, score_8_up, score_7_up, score_6_up, source_anime,
lineart, portrait, from above,
1girl, ginger, solo,
writing, holding pencil,
sitting, heads up,
sweat, freckles, small breasts,
ginger hair, long hair, straight hair,
blue eyes, glowing eyes, glasses, looking at viewer,
smile, smirk, grin, frown,
white tank top,
indoor, library,
sunset, sunny, daylight,
sidelighting, light particles,
desk, chair,
As you can see, the line breaks don't perfectly match the 4 categories of the structure. Depending on what you want to generate, it can make more sense to visually separate smaller or longer parts. Here, the subject, actions and body posture are grouped together: splitting them would mostly create visual clutter.
Use cases
Here are some use cases and favorite keywords. I try to organise them by logical group, from the most general to the most specific. I recommend cherry-picking the ones that make sense for you rather than copy-pasting the whole line.
You'll also find a reference to where I place each group in the structure. I'll use an ID with the format [xx.yy], using abbreviations of the category and its sub-category.
Keep in mind that some of them could also go earlier or later in the order (e.g. upside down could be placed as [compos.pov] or [body.posture]).
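If you keep your own keyword groups in a text file using the "Name [xx.yy]: tag, tag" layout described above, a small parser makes cherry-picking easier. This is only a sketch: parse_group is a hypothetical helper, and it assumes the name and IDs come before the first colon and that tags are separated by plain commas (so parenthesised variants containing commas would need extra handling).

```python
import re

# Matches one "[category.sub]" or "[category]" reference.
_ID = re.compile(r"\[([a-z]+)(?:\.([a-z]+))?\]")


def parse_group(line: str) -> dict:
    """Split a keyword-group line into its name, placement IDs and tags."""
    head, _, tags = line.partition(":")
    # findall yields "" for a missing sub-category; normalise it to None.
    ids = [(cat, sub or None) for cat, sub in _ID.findall(head)]
    name = _ID.sub("", head).replace(",", " ").strip()
    return {
        "name": name,
        "ids": ids,
        "tags": [t.strip() for t in tags.split(",") if t.strip()],
    }


group = parse_group("Lighting [compos.light]: backlighting, dim light, rim light")
print(group["name"])      # Lighting
print(group["ids"])       # [('compos', 'light')]
print(group["tags"][:2])  # ['backlighting', 'dim light']
```

A group with several placements, such as "[subjects],[hair]", simply yields several entries in ids, each with a sub-category of None.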
Generic
Here is what I use in almost all my prompts to influence the overall quality of the image. I only add extra negative keywords once something I don't like appears, not before. Again, I like to keep things short and simple.
The following list covers some classic cases, detailing both positive and negative prompts.
Quality (SD1.5)
Positive: (absurdres, best quality, masterpiece:1.4),
Negative: (worst quality, low quality, lowres, normal quality:1.4)
Quality (Pony)
Positive: score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up,
Negative: score_6, score_5, score_4,
Details
Positive: detailed, detailed body, detailed face
Negative+
Negative: text, logo, watermark, digit, multiple views, monochrome,
bad proportions, anatomical nonsense, bad hands, bad face
Note: Negative+ is an additional list from which I cherry-pick according to the situation. In my experience, negative keywords related to anatomy don't always improve the results, so I just keep them aside in case I need them.
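The presets above can be kept in a small lookup so that the negative prompt starts minimal and only grows when needed. This is a sketch under my own assumptions: in each quality preset, the first line is read as the positive prompt and the second as the negative one, and the names weighted, QUALITY_PRESETS and build_negative are hypothetical.

```python
def weighted(tags: list[str], weight: float) -> str:
    """Wrap tags in the "(tag, tag:weight)" form used by the SD1.5 preset."""
    return f"({', '.join(tags)}:{weight})"


# Assumption: first line of each preset is positive, second is negative.
QUALITY_PRESETS = {
    "sd15": {
        "positive": weighted(["absurdres", "best quality", "masterpiece"], 1.4),
        "negative": weighted(
            ["worst quality", "low quality", "lowres", "normal quality"], 1.4
        ),
    },
    "pony": {
        "positive": "score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up",
        "negative": "score_6, score_5, score_4",
    },
}


def build_negative(model: str, extras: tuple[str, ...] = ()) -> str:
    """Start from the model preset and append only the cherry-picked extras."""
    return ", ".join([QUALITY_PRESETS[model]["negative"], *extras])


print(build_negative("pony"))
print(build_negative("sd15", ("watermark", "bad hands")))
```

Calling build_negative without extras gives the bare preset; the Negative+ keywords are passed in one by one, only after an unwanted element actually shows up in the output.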
Scenes
Actions [compos.style]: motion lines, emphasis lines, speed lines, afterimage, motion blur, bouncing (+hair, breasts, ass, etc.), ass ripple, sweatdrop, trembling, shaking, twitching, (spoken) sound effects, sparkling sweat
Lighting [compos.light]: backlighting, dim light, rim light, dusk, sunset, dawn, sunrise, light particles, light rays, sunlight, dappled sunlight, tree shade, crack of light
Hot [body.posture]: sweat, sweating, very sweaty, sweating profusely, sweatdrop, wet (+clothes, hair, etc.), sweat clothes, hot, heat stroke, blush, full-face blush, body blush, breath, heavy breathing, steaming body
Bondage: bondage, shibari, leash, animal collar, cuffs, arms behind head / back, rope, gag
Gangbang [subjects]: multiple boys/girls, group sex, gangbang, orgy, dogpile, love train, threesome
Note: specify the exact number with xboys/xgirls.
Tentacles [subjects]: tentacles, penis tentacle, veiny tentacles, tentacle sex, tentacle pit, slimy, levitation, restrained, suspension, lifted
Subject
Multicolored hair [subjects],[hair]: multicolored hair, gradient hair, colored tips, <color-1> hair, <color-2> hair
Yakuza [subjects],[body]: yakuza, tattoo (irezumi, full-body, etc.), piercing (nose, navel, nipple, etc), makeup
Demon [subjects],[body]: demon, demon girl, horns (demon, dragon, curled, cow, etc), tail (demon, raised, etc), colored skin (red, black, etc.)
Goblin [subjects],[body]: goblin, female goblin, green skin, pointy ears, fangs
Furry [subjects]: furry, furry female, furry with [furry/non-furry], <color> fur, fang(s)