Disclaimer: I'm not an image generation expert, so feel free to discuss any mistakes in the comments.
Check out my new article about what is popular in Pony Diffusion: https://civitai.com/articles/8886. It explains more concepts, such as CFG Scale, Sampler, Steps, and LoRAs!
I didn’t invent anything; there are more detailed tutorials available than this one. My goal was to write the kind of article I wish I had when I first started generating images - short and filled with visuals and links. So, here it is!
TL;DR:
The quickest way to create cool images for people who, like me, lack imagination:
1. Find an image on Civitai in the style you like
2. Click remix
3. Open this website: https://danbooru.donmai.us/posts
4. Find an image you want to use as inspiration
5. Copy the tags on the left
6. Use those tags in your positive prompt, e.g. "score_9, score_8_up, score_7_up, score_6_up, [copied tags without numbers]"
This is how I was inspired to create my most popular image (It is NSFW, so I moderated it):
Original image: https://danbooru.donmai.us/posts/5886691
BOORU
What are booru tags, and why should I care how many images each one has? In other words, how can you tell which words you can use in a prompt?
One of the main questions I had when I first started generating images was which words I could put in the prompt and which of them the model would understand.
For example, why can you generate a black mask but not a green mask?
It may not be very obvious for beginners, but Pony Diffusion V6 XL was trained on images from booru imageboards.
You can find them with a quick internet search, but here is the booru I use: https://danbooru.donmai.us/posts?tags=black_mask+&z=5
It has 6.3k images tagged with a black mask, but only 203 with a green mask.
My rule of thumb is that a tag should have at least 1k images to be recognized. With at least 3k, it is almost always safe to use.
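If you like to see rules written down as code, here is a minimal Python sketch of that rule of thumb. The function names `parse_count` and `tag_confidence` and the returned labels are made up for this example, and the 1k and 3k thresholds are just my rule of thumb, not something the model guarantees.

```python
def parse_count(text: str) -> int:
    """Turn a count as shown on a booru page ("203", "6.3k", "1.2M") into an integer."""
    text = text.strip().lower()
    multipliers = {"k": 1_000, "m": 1_000_000}
    if text and text[-1] in multipliers:
        # round() avoids off-by-one results from floating-point multiplication
        return round(float(text[:-1]) * multipliers[text[-1]])
    return int(text)


def tag_confidence(post_count: int) -> str:
    """Apply the rule of thumb: 3k+ is almost safe, 1k+ is probably recognized."""
    if post_count >= 3_000:
        return "almost safe"
    if post_count >= 1_000:
        return "probably recognized"
    return "probably not recognized"


print(tag_confidence(parse_count("6.3k")))  # black mask
print(tag_confidence(parse_count("203")))   # green mask
```

With the numbers from above, the black mask lands in the "almost safe" bucket and the green mask in the "probably not recognized" one.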
Quick and long ways to learn booru tags
There are cool articles on Civitai that will help you quickly understand which booru tags you can use, e.g.:
https://civitai.com/articles/6349/280-pony-diffusion-xl-recognized-clothing-list-booru-tags-sfw
https://civitai.com/articles/5150/danbooru-tagging-visualization-for-ponyxl-autismmix
But your real friend is this list of tag groups:
https://danbooru.donmai.us/wiki_pages/tag_groups
You can use it to learn what words the model understands, from different clothes to poses and gestures.
I often use it for inspiration or just when I forget the correct wording.
Fast way to convert booru tags to Civitai prompts
The full story is in this article:
https://civitai.com/articles/2113/regex-for-quick-conversion-of-booru-tags-to-sd-prompts
The regex didn't work for me (Firefox), so I used ChatGPT to create a new one.
Short story:
1. Copy tags
2. Open URL: https://regex101.com/r/zLcDno/1
3. Paste the tags into TEST STRING
4. Copy the parsed tags from below
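If you would rather not depend on a website, the same cleanup can be sketched in a few lines of Python. This is not the regex from the links above, just my guess at an equivalent. It assumes each copied line looks like "? black mask 6.3k" (an optional leading "?", the tag, then a count); the name `booru_to_prompt` is made up for this example.

```python
import re

# Assumption: a copied line ends with a post count such as "203", "6.3k" or "1.2M".
_COUNT_SUFFIX = re.compile(r"\s+\d+(?:\.\d+)?[kKmM]?$")


def booru_to_prompt(copied_text: str) -> str:
    """Strip the leading "?" and trailing count from each line, then join with commas."""
    tags = []
    for raw in copied_text.splitlines():
        line = raw.strip().lstrip("?").strip()
        line = _COUNT_SUFFIX.sub("", line)
        if line:  # skip empty lines
            tags.append(line)
    return ", ".join(tags)


copied = """? 1girl 6.5M
? black mask 6.3k
? horns 203"""
print(booru_to_prompt(copied))
```

Section headings that get copied along with the tags are not filtered out here, so delete them by hand before pasting the result into your prompt.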
Prompt Engineering Tips
With longer prompts you lose control over details
I've heard the opinion that the longer the prompt, the better the control. In reality, it's the opposite: the more words you use, the bigger the chance that the model will ignore one of them.
It's up to you to decide which image you like more, but you can't deny that the second image has no horns.
The model's attention is on the first words
The last words influence the image much less than the first ones.
This is how it works:
I just moved "blue skin, blue oni, facial tattoo" to the top, and the horns came back!
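Reordering by hand gets tedious with long prompts, so here is a minimal Python sketch of the same move. The helper name `move_to_front` and the example prompt are made up for illustration; the code only reorders comma-separated tags and knows nothing about the model itself.

```python
from typing import Iterable


def move_to_front(prompt: str, important: Iterable[str]) -> str:
    """Move the given tags to the start of a comma-separated prompt, keeping the rest in order."""
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    # Keep only the important tags that are actually in the prompt, in the order requested
    front = [t for t in important if t in tags]
    rest = [t for t in tags if t not in front]
    return ", ".join(front + rest)


prompt = "1girl, forest, night, blue skin, blue oni, facial tattoo"
print(move_to_front(prompt, ["blue skin", "blue oni", "facial tattoo"]))
```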
More control over priority
Of course, thinking about word order is not fun, so an easier way to influence the priority is to use special symbols:
(blue oni) - gives +10% priority to "blue oni"
(blue skin:1.2) - gives +20% priority to "blue skin"
But think of this method as a quick hack, not as a main method.
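If you build prompts in a script, the two forms above are easy to generate. This is only a sketch of the syntax as I described it; the helper name `weighted` is made up for this example.

```python
import math


def weighted(tag: str, weight: float = 1.1) -> str:
    """Wrap a tag in the priority syntax: (tag) for +10%, (tag:1.2) for other weights."""
    if math.isclose(weight, 1.1):
        return f"({tag})"
    return f"({tag}:{weight:g})"


print(weighted("blue oni"))        # the +10% form
print(weighted("blue skin", 1.2))  # the explicit-weight form
```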
Did you notice that the facial tattoos disappeared? You can play this game for a long time. This is why I prefer to apply those methods in this order:
1. Try to make the prompt shorter
2. Move important words up
3. Use the priority hack
Negative Prompts
Not to improve the quality
Pony Diffusion V6 XL's creator explicitly states this: "The model is designed to not need negative prompts in most cases."
This is disappointing, but asking the model not to generate six fingers won't work:
Change the style
Negative prompts can still be used to define your style, so you don't need to extend your positive prompt:
Hide something
You can still use the negative prompt to fine-tune an image a little by removing objects you don't want to see:
Negative prompt I use
score_6, score_5, score_4, pony, furry, monochrome, curvy, fat, pubic hair, watermark,
artist name, ugly, ugly face, mutated hands, low res, bad anatomy, bad eyes, blurry face, unfinished, sketch, greyscale, (deformed),
Funnily enough, I don't follow my own recommendations. Each time I try to delete something from this negative prompt, I don't like the results. What I dislike is the change in style, not the quality: the images still have mutated hands and watermarks either way.