Pony Diffusion V6 XL Source Comparisons
I've been getting some DMs lately asking for my prompts and how I get images to look a certain way with the base Pony Diffusion V6 XL model (my personal favourite).
Pony was trained on millions of anime, cartoon, furry, and anthro images, some 3D, some not. I never recommend the base Pony model to beginners because its dataset is so large and covers so many different styles that if you don't control for them (i.e., filter sources through your prompts), your image can become unintentionally contaminated by sources you don't want, which drastically alters the style of the image.
That being said, Pony's large dataset can be a HUGE boon if you know how to control what the model creates, as Pony is the best model for generating NSFW poses.
I've written this article to showcase how your prompting can affect the style of the image, and I hope others can use this article as a resource or guide in the future.
Image Comparison
Disclaimer: These images were generated locally with A1111. Because of the way Civitai adds embeddings/textual inversions and other elements behind the scenes, the prompts will get altered and images might look slightly different when generated on Civitai or other online generators.
List of sources to filter:
source_cartoon
source_3d
source_pony
source_anime
source_furry
source_western
source_comic
source_monster
Note: there may be more source tags that I'm unaware of, but I can't find documentation for any beyond the few the Pony creators provided. I did stumble upon some source tags from other users that DID have an effect on image generation, so I've included those (yellow text).
EDIT: @Venomee mentioned source_monster being a tag used in training and I tested it out after posting this article. I can confirm source_monster does exist in Pony's dataset. However, the influence of source_monster is extremely low and requires increasing the weight to get any discernible effect. This is most likely due to Pony having very few source_monster images in its dataset, which means this tag shouldn't contaminate your images if you aren't generating any monster pictures.
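For anyone unsure what "increasing the weight" looks like in practice: A1111 uses the (tag:weight) attention syntax, the same one that appears in the Realism negative prompt later in this article. Here is a minimal sketch of it. The helper name weighted is hypothetical, and the 1.5 is only an illustrative value, not a weight that was tested for source_monster.

```python
def weighted(tag: str, weight: float) -> str:
    """Wrap a tag in A1111-style attention syntax, e.g. (tag:1.5).

    A weight of 1.0 is the default emphasis, so the tag is returned unchanged.
    """
    if weight == 1.0:
        return tag
    # :g drops trailing zeros, so 1.5 stays "1.5" and 2.0 becomes "2"
    return f"({tag}:{weight:g})"


print(weighted("source_monster", 1.5))  # (source_monster:1.5)
```

You would paste the result into the positive prompt to push the style, or into the negative prompt to suppress it.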
Constants:
Checkpoint:
Pony Diffusion V6 XL
Seed:
Fixed (not -1)
Style LoRAs:
<lora:rokudenashi01c:0.1>, <lora:1dkXLP:0.3>, <lora:ZoroJ:0.35>, <lora:add-detail-xl:2>, <lora:yuatarisf01:1>
Method:
Source tags I wanted to emphasize were typed in the positive prompt and placed after the typical Pony quality tags:
score_9, score_8_up, score_7_up
Source tags I wanted to filter out were placed in the negative prompt.
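The method above can be sketched as a small helper that assembles both prompts. Treat this as an illustration under a few assumptions: the function name build_prompts and its keep parameter are hypothetical, the subject text "a man" is a placeholder because the full prompt isn't given here, and the tag list covers only the seven sources used in the comparisons (source_monster is left out, as it is in the negative prompts in this article).

```python
QUALITY_TAGS = ["score_9", "score_8_up", "score_7_up"]

# The seven source tags that appear in the comparison prompts.
SOURCE_TAGS = [
    "source_western", "source_comic", "source_cartoon", "source_3d",
    "source_anime", "source_furry", "source_pony",
]


def build_prompts(subject, emphasize=None, keep=()):
    """Return (positive, negative) prompt strings.

    emphasize: one source tag to place in the positive prompt, or None.
    keep: source tags to leave out of BOTH prompts (neither pushed nor filtered).
    """
    if emphasize is not None and emphasize not in SOURCE_TAGS:
        raise ValueError(f"unknown source tag: {emphasize}")
    # Quality tags first, then the emphasized source, then the subject.
    positive = QUALITY_TAGS + ([emphasize] if emphasize else []) + [subject]
    # Every other source goes to the negative prompt unless explicitly kept.
    negative = [t for t in SOURCE_TAGS if t != emphasize and t not in keep]
    return ", ".join(positive), ", ".join(negative)


positive, negative = build_prompts("a man", emphasize="source_anime")
print(positive)
print(negative)
```

Passing keep=("source_pony",) with emphasize="source_anime" leaves source_pony out of the negative prompt, which corresponds to the "Anime style but without filtering Pony" comparison at the end of the article.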
Image generation settings:
Steps: 25, Sampler: Euler a, Schedule type: Automatic, CFG scale: 7, Size: 1024x1024
The ADetailer extension was used to automatically fix the faces on images after the txt2img process:
ADetailer model: face_yolov8n.pt, ADetailer confidence: 0.3, ADetailer mask only top k largest: 1, ADetailer dilate erode: 4, ADetailer mask blur: 4, ADetailer denoising strength: 0.35, ADetailer inpaint only masked: True, ADetailer inpaint padding: 32, ADetailer version: 24.9.0
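If you would rather script these settings than click through the UI, the txt2img values above map onto a request body roughly like the sketch below. It assumes the A1111 web UI was launched with the --api flag and that the body is POSTed to /sdapi/v1/txt2img. The function name txt2img_payload is hypothetical, the seed 12345 is an arbitrary stand-in (the actual seed isn't stated in this article), and the prompt strings are placeholders. The schedule type and the ADetailer settings are deliberately left out, since how they are passed depends on the web UI and extension versions; check the /docs page of your own install before relying on any field.

```python
import json


def txt2img_payload(positive: str, negative: str, seed: int) -> dict:
    """Build a request body using the generation settings from this article."""
    if seed < 0:
        # -1 means "random seed" in A1111, which would break the comparison.
        raise ValueError("use a fixed, non-negative seed")
    return {
        "prompt": positive,
        "negative_prompt": negative,
        "seed": seed,
        "steps": 25,
        "sampler_name": "Euler a",
        "cfg_scale": 7,
        "width": 1024,
        "height": 1024,
    }


payload = txt2img_payload(
    positive="score_9, score_8_up, score_7_up, source_anime, a man",  # placeholder
    negative="source_western, source_comic, source_cartoon, source_3d, source_furry, source_pony",
    seed=12345,  # arbitrary example value
)
print(json.dumps(payload, indent=2))
```

Keeping the seed and every other setting fixed while changing only the source tags is what makes the images below comparable.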
No Source Filtering:
An image of a man without any source filtering. You can see that the image is an amalgamation of all styles present in the dataset.
Anime:
Positive:
source_anime
Negative:
source_western, source_comic, source_cartoon, source_3d, source_furry, source_pony
An image of a man in East Asian animation/illustration style.
Pony:
Positive:
source_pony
Negative:
source_western, source_3d, source_cartoon, source_anime, source_comic, source_furry
An image of a man in My Little Pony style. Note that the man is still human and doesn't look anthro/pony. However, some of his features have been modified to look more like an MLP pony.
Furry:
Positive:
source_furry
Negative:
source_western, source_3d, source_cartoon, source_anime, source_comic, source_pony
An image of a man in furry/anthro style. Note that the man is still human and doesn't look anthro/furry. However, some of his features now have an animalistic quality.
Comic:
Positive:
source_comic
Negative:
source_western, source_3d, source_cartoon, source_anime, source_furry, source_pony
An image of a man in comic style. source_western is kept in the negative prompt to highlight the influence of source_comic. However, removing it from the negative prompt might shift the image more towards the style of American comic illustrations.
Cartoon:
Positive:
source_cartoon
Negative:
source_western, source_3d, source_pony, source_anime, source_comic, source_furry
An image of a man in cartoon style. As with the comic example above, removing source_western from the negative prompt might shift the image more towards an American cartoon style.
Western:
Positive:
source_western
Negative:
source_cartoon, source_3d, source_pony, source_anime, source_comic, source_furry
An image of a man in western illustration style.
3D:
Positive:
source_3d
Negative:
source_western, source_comic, source_cartoon, source_anime, source_furry, source_pony
An image of a man in 3D animation style. This is... uh... yeah...
Realism(?):
Negative:
source_western, source_cartoon, source_3d, source_pony, source_anime, source_comic, source_furry, (drawn, furry, illustration, cartoon, anime, comic:1.5), 3d, cgi
An image of a man in a more realistic style. This has all the sources filtered out. Due to the style LoRAs I used, the image is still an illustration. However, we can see the image is leaning towards a more realistic output.
Style I use:
Negative:
source_cartoon, source_3d, source_furry
This is the style I personally like generating my images in. I don't emphasize any specific source. I was unaware of the western and comic source tags prior to writing this article, so they aren't filtered out here. I personally like how the Pony source influences the eyes and face, which is why I don't de-emphasize it.
Anime style but without filtering Pony from the dataset:
Positive:
source_anime
Negative:
source_western, source_comic, source_cartoon, source_3d, source_furry
Very cute.