Just Another Image Generation Guide for Beginners (Pony Diffusion V6 XL and Illustrious)

Disclaimer: I'm not an image generation expert, so feel free to discuss any mistakes in the comments.

  • (Update Feb 12, 2025) – I added some basic info on how to choose the checkpoint. Any updated sections are highlighted in orange.

  • (Update Feb 12, 2025) – Now includes details about the Illustrious model: WAI-NSFW-illustrious-SDXL. This should apply to other Illustrious models as well, but since there are some differences between them, be sure to check their descriptions first. Updated sections are marked in orange.

  • (Update Nov 19, 2024) Check out my new article about what is popular in Pony Diffusion: https://civitai.com/articles/8886. It will explain more concepts like CFG Scale, Sampler, Steps, and Loras!

  • (Created: Nov 4, 2024) – This article originally focused on Pony Diffusion V6 XL (PDXL).

I didn’t invent anything; there are more detailed tutorials available than this one. My goal was to write the kind of article I wish I had when I first started generating images - short and filled with visuals and links. So, here it is!

TL;DR:

The quickest way to create cool images for people who, like me, have no imagination:

1. Find an image on Civitai in the style you like

2. Click remix

3. Open website: https://danbooru.donmai.us/posts

4. Find an image you want to use as inspiration

5. Copy the tags on the left

6. Use those tags in your positive prompt:

* for PDXL "score_9, score_8_up, score_7_up, score_6_up, [copied tags without numbers]"

* for Illustrious “masterpiece, best quality, amazing quality, [copied tags without numbers]”
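Step 6 can be sketched in a few lines of Python. This is only an illustration: the function name `build_positive_prompt` and the `QUALITY_PREFIX` table are made up for this example, and the prefixes are simply the ones listed above.

```python
# Quality prefixes copied from step 6 above.
# The dictionary keys ("pdxl", "illustrious") are hypothetical labels.
QUALITY_PREFIX = {
    "pdxl": ["score_9", "score_8_up", "score_7_up", "score_6_up"],
    "illustrious": ["masterpiece", "best quality", "amazing quality"],
}


def build_positive_prompt(model: str, tags: list[str]) -> str:
    """Join the model's quality prefix and the copied tags into one prompt."""
    cleaned = [t.strip() for t in tags if t.strip()]  # drop empty entries
    return ", ".join(QUALITY_PREFIX[model] + cleaned)


print(build_positive_prompt("pdxl", ["1girl", "blue skin", "horns"]))
print(build_positive_prompt("illustrious", ["1girl", "blue skin", "horns"]))
```

The tags passed in are expected to be already stripped of their image counts.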

This is how I was inspired to create my most popular image (It is NSFW, so I moderated it):

Original image: https://danbooru.donmai.us/posts/5886691

Which model to choose

Model types

If you're just starting out, you only need to focus on two types of models:

Checkpoint – This is the main model that generates your image.

Lora – This modifies the checkpoint, adding specific styles, poses, characters, etc.

Which Checkpoint to choose

Here are some stats on the most popular models (as of Feb 11, 2025):

Most checkpoints are based on Pony or Illustrious. Even though Civitai sets Flux as the default, I still think Pony or Illustrious is a better choice for beginners. They're also much cheaper than Flux.

Pony VS Illustrious

So, which one should you pick?

Both models are built on Stable Diffusion XL, but Illustrious is newer. It follows prompts more accurately, understands characters and styles better, and rarely messes up fingers.

I see Illustrious as an improved version of Pony, but Pony is still more popular, so you'll find more Loras and example images for it.

BOORU

What are booru tags, and why should I care how many images each tag has? In other words, how do you know which words you can use in the prompt?

One of the main questions I had when I first started generating images was which words I could put in the prompt and which of them the model would understand.

For example, why can you generate a black mask but not a green mask?

It may not be very obvious for beginners, but Pony Diffusion V6 XL was trained on images from booru imageboards.

You can find several of them with a web search, but here is the booru I use: https://danbooru.donmai.us/posts?tags=black_mask+&z=5

It has 6.3k images tagged with a black mask, but only 203 with a green mask.

My rule of thumb is that a tag needs at least 1k images to be recognized. With 3k or more, it's almost always safe to use.

However, Illustrious models follow tags much better. I’ve read that the new rule of thumb for Illustrious is around 100 images for a tag to be recognized.

For example, Illustrious models actually understand the green mask tag, even though it only has 203 images.
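The rules of thumb above can be written down as a tiny lookup. This is a rough sketch, not a measurement: the thresholds are just the numbers from this section, and the function name `tag_confidence` and its labels are invented for the example.

```python
# Thresholds are the rules of thumb from the text, not measured values.
MIN_POSTS = {"pdxl": 1_000, "illustrious": 100}
PDXL_SAFE_POSTS = 3_000


def tag_confidence(model: str, post_count: int) -> str:
    """Classify a tag by how many booru images carry it."""
    if post_count < MIN_POSTS[model]:
        return "probably not recognized"
    if model == "pdxl" and post_count < PDXL_SAFE_POSTS:
        return "may be recognized"
    return "likely recognized"


# Image counts quoted above: black mask 6.3k, green mask 203.
for tag, count in [("black mask", 6_300), ("green mask", 203)]:
    print(tag,
          "| pdxl:", tag_confidence("pdxl", count),
          "| illustrious:", tag_confidence("illustrious", count))
```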

The quick way and the long way to learn booru tags

There are cool articles on Civitai that will help you quickly understand which booru tags you can use, e.g.:

Pony XL:

https://civitai.com/articles/6349/280-pony-diffusion-xl-recognized-clothing-list-booru-tags-sfw

https://civitai.com/articles/7323/pony-realism-vs-danbooru-handsfingers-tags-most-non-sexual-related-wip

https://civitai.com/articles/5150/danbooru-tagging-visualization-for-ponyxl-autismmix

Illustrious:

https://civitai.com/articles/7819/illustrious-xl-v01-visual-dictionary

But your real friend is this list of tag groups:

https://danbooru.donmai.us/wiki_pages/tag_groups

You can use it to learn what words the model understands, from different clothes to poses and gestures.

I often use it for inspiration or just when I forget the correct wording.

Fast way to convert booru tags to Civitai prompts

The full story is in this article:

https://civitai.com/articles/2113/regex-for-quick-conversion-of-booru-tags-to-sd-prompts

The regex didn't work for me (Firefox), so I used ChatGPT to create a new one.

Short story:

1. Copy the tags

2. Open URL: https://regex101.com/r/zLcDno/1

3. Paste the tags into TEST STRING

4. Copy the parsed tags from the output below it
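If you prefer to do the conversion locally, the same idea fits in a short Python script. This is not the regex from the linked article or from regex101. It is a separate sketch that assumes each copied line looks like "? black mask 6.3k" (an optional "?" link, the tag name, then the image count). The name `booru_to_prompt` is made up for the example.

```python
import re

# Assumed shape of one copied sidebar line: "? black mask 6.3k"
# (optional "?", the tag name, then a count such as 203, 6.3k or 1.2M).
TAG_LINE = re.compile(r"^\??\s*(?P<tag>.+?)\s+\d+(?:\.\d+)?[kKmM]?$")


def booru_to_prompt(copied: str) -> str:
    """Turn copied sidebar text into a comma-separated tag list."""
    tags = []
    for line in copied.splitlines():
        match = TAG_LINE.match(line.strip())
        if match:  # lines without a count (e.g. section headings) are skipped
            tags.append(match.group("tag"))
    return ", ".join(tags)


copied_text = """General
? 1girl 5.2M
? black mask 6.3k
? green mask 203"""
print(booru_to_prompt(copied_text))
```

If your browser copies the sidebar in a different shape, the pattern in `TAG_LINE` will need adjusting.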

Change the style

It might not be obvious, but you can change the style just by using booru tags. This works for both PDXL and Illustrious, but it’s much more effective with Illustrious models.

If you're looking for inspiration, here are some great resources:

https://civitai.com/articles/9309/artists-for-illustrious-xl - the easiest way to browse styles.

https://civitai.com/articles/8977 - includes links to “Spaces” (doesn’t work in Firefox).

https://huggingface.co/datasets/sieecc/SOKI/blob/main/300Styles-waiNSFW.html - you’ll need to download the HTML file and open it in your browser.

Prompt Engineering Tips

With longer prompts you lose control over details

I've heard the opinion that a longer prompt gives you more control. In practice, it's the opposite: the more words you add, the bigger the chance that the model ignores one of them.

It's up to you to decide which image you like more, but you can't deny that the second image has no horns.

This issue isn’t as bad with Illustrious models, but I’ve still had cases where it forgot to generate horns, so the problem is definitely still there.

The model's attention is on the first words

Words at the end of the prompt influence the image much less than words at the start.

This is how it works:

I just moved "blue skin, blue oni, facial tattoo" to the top, and the horns came back!
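Moving tags to the top by hand gets tedious, so here is a small helper that does it. It's a minimal sketch: `move_to_front` is a made-up name, and the example prompt is invented for illustration, not the exact prompt behind the images.

```python
def move_to_front(prompt: str, important: list[str]) -> str:
    """Return the prompt with the `important` tags moved to the start."""
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    front = [t for t in important if t in tags]   # keep the requested order
    rest = [t for t in tags if t not in front]    # everything else, unchanged
    return ", ".join(front + rest)


# Hypothetical prompt where the important tags ended up last.
prompt = "1girl, kimono, night, horns, blue skin, blue oni, facial tattoo"
print(move_to_front(prompt, ["blue skin", "blue oni", "facial tattoo"]))
```

Tags that are not in the prompt are ignored rather than added, so the helper only reorders what is already there.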

More control over priority

Of course, thinking about word order is not fun, so an easier way to influence priority is to use special symbols:

(blue oni) - gives +10% priority to "blue oni"

(blue skin:1.2) - gives +20% priority to "blue skin"

But think of this method as a quick hack, not as a main method.
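To make the two notations concrete, here is a small sketch that builds them and converts a weight into the "+N%" figure used above. The helper names `emphasize` and `extra_priority_percent` are invented for this example, and the 1.1 default for bare parentheses is taken from the "+10%" figure in the text.

```python
from typing import Optional


def emphasize(tag: str, weight: Optional[float] = None) -> str:
    """Wrap a tag in the emphasis syntax shown above.

    No weight  -> "(tag)", described in the text as +10%.
    weight=1.2 -> "(tag:1.2)", i.e. +20%.
    """
    if weight is None:
        return f"({tag})"
    return f"({tag}:{weight:g})"


def extra_priority_percent(weight: Optional[float] = None) -> int:
    """Convert a weight into the '+N%' figure used in the text."""
    effective = 1.1 if weight is None else weight  # bare parentheses assumed to mean 1.1
    return round((effective - 1.0) * 100)


print(emphasize("blue oni"), extra_priority_percent())
print(emphasize("blue skin", 1.2), extra_priority_percent(1.2))
```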

Did you notice that the facial tattoo disappeared? You can keep playing this game for a long time. This is why I prefer to apply these methods in this order:

1. Make the prompt shorter

2. Move important words up

3. Use the priority hack

Negative Prompts

Not for improving quality

Pony Diffusion V6 XL's creator explicitly states this: "The model is designed to not need negative prompts in most cases."

Disappointingly, asking the model not to generate six fingers won't work:

Change the style

Negative prompts can still be used to define your style, so you don't need to extend your positive prompt:

Hide something

You can still use a negative prompt to fine-tune an image a little by removing objects you don't want to see:

Negative prompt I use (for PDXL)

score_6, score_5, score_4, pony, furry, monochrome, curvy, fat, pubic hair, watermark, artist name, ugly, ugly face, mutated hands, low res, bad anatomy, bad eyes, blurry face, unfinished, sketch, greyscale, (deformed)

Funnily enough, I don't follow my own recommendations. Each time I try to delete something from this negative prompt, I don't like the result. What I dislike is the change in style, not the quality: the image still has mutated hands and watermarks either way.
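If you want to run the same experiment, a short script can list every version of a negative prompt with exactly one tag removed, so you can regenerate and compare them. This is a sketch only: `drop_one_variants` is a made-up helper, and the example uses a shortened prompt.

```python
def drop_one_variants(negative_prompt: str) -> dict[str, str]:
    """Map each tag to the negative prompt with that one tag removed."""
    tags = [t.strip() for t in negative_prompt.split(",") if t.strip()]
    return {
        removed: ", ".join(t for t in tags if t != removed)
        for removed in tags
    }


variants = drop_one_variants("score_6, score_5, score_4, watermark, sketch")
for removed, prompt in variants.items():
    print(f"without {removed}: {prompt}")
```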
