I've often described CLIP (which is the interface between your prompt and the model's interpretation of it -- at least under PDXL/Illustrious, which is what I work with) as a "quirky Bob Ross crossed with a semantic Karen" - and if you've worked with CLIP-based models for a while you will understand why.
Based on that mental image I came up with "CLIP-chan" and called her 'Karen Ross'. Here are some of her moods/vibes that I've been making.
These six were picked out of a pool of ~40 images generated with small prompt variations for poses and expressions.
And like I said in my other CLIP-chan post:
- If your model fights you (especially on wording/intent), that's semantic Karen demanding to see you
- If you're not getting exactly what you wanted/expected, that's Bob Ross having a happy accident
... and really, -take- the happy accidents. Don't try to describe every pixel; trust your model and let it solve the image for you.