Type: Textual Inversion
Stats: 1,304
Reviews: 138
Published: May 2, 2023
Base Model: SD 1.5
Training: 25,000 steps, 1,041 epochs
Trigger Words: 8hit8
Hash: AutoV2 7D3CCAA11D
Background:
This Textual Inversion embedding bears a striking resemblance to a famous JAV actress with world-renowned assets.
Trained using base SD1.5 on 50 high-quality 512x512 cropped images of the subject "modeling", with text removed from the images. The embedding uses 16 tokens; that seems excessive, but it was the only way I got good results.
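As a rough illustration of that preprocessing step, here's a minimal center-crop-and-resize sketch using Pillow. The raw/ and train/ folder names are hypothetical, and this only covers the cropping, not the text removal:

```python
from pathlib import Path
from PIL import Image

SRC = Path("raw")    # hypothetical folder of source photos
DST = Path("train")  # hypothetical output folder for the 512x512 crops
DST.mkdir(exist_ok=True)

for img_path in SRC.glob("*.jpg"):
    img = Image.open(img_path).convert("RGB")
    # Center-crop to a square so the resize doesn't distort the aspect ratio.
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img.resize((512, 512), Image.LANCZOS).save(DST / img_path.name, quality=95)
```

In practice you'd likely choose each crop by hand to keep the face in frame (see the note about close-ups under Important Points) rather than blindly center-cropping.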
I'm very new to this, and I've found that training is, unfortunately, an art, not a science. So, this embedding isn't perfect and has some limitations. Please keep that in mind when you rate!
Note: Remove the ".zip" from the end of the file name and place the file in the embeddings folder. The default trigger is 8hit8.
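If it helps, here's that install step as a minimal Python sketch; the file name, download path, and webui location are assumptions, so adjust them to your setup:

```python
from pathlib import Path

# Assumed locations: adjust to your actual download folder and webui install.
downloaded = Path.home() / "Downloads" / "8hit8.pt.zip"
embeddings = Path.home() / "stable-diffusion-webui" / "embeddings"

# The ".zip" extension is spurious; strip it and move the file into place.
downloaded.rename(embeddings / downloaded.stem)  # -> embeddings/8hit8.pt
```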
Important Points:
For whatever reason, the embedding performs worse in realistic generations and on the base SD1.5 model; other models and artistic styles produce better results.
The results get much better with prompt engineering and, especially, lengthy prompts.
You definitely have to adjust the weight of the trigger word relative to the rest of the prompt (see the example after this list).
Generations of the subject's assets can be prone to distortion, especially when they are exposed. This doesn't happen often, but it does happen. I believe it's because their extreme size made it impossible to fit both them and the subject's face in a close-up when cropping to a square image. Using words in the prompt to describe those assets seems to mitigate it.
You will need to include words like naked, nude, and topless in the negative prompt to avoid accidental wardrobe malfunctions (we wouldn't want that, now would we?).
I've found that including open mouth and teeth in the negative prompt improves generations.
Unless you include strong, specific descriptions of your intended background scenery or setting, generations tend to incorporate the following elements: palm trees, desert plants, brick walls, distant buildings, sunny weather, and house interiors.
For further tips & tricks, see the PNG info in the sample images. Yes, my prompting style is weird and complicated, but it works, right?
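To make the weighting and negative-prompt advice above concrete, here's a minimal sketch of a generation request against the AUTOMATIC1111 webui API. This assumes the webui is running locally with --api; the prompt text, the 1.2 weight, and the settings are illustrative, not the exact settings from my samples:

```python
import json
from urllib import request

# Illustrative payload: a weighted trigger word, an explicit background
# description, and the negative-prompt terms suggested above.
payload = {
    "prompt": "(8hit8:1.2), portrait photo, elegant dress, "
              "standing in a dim art deco hotel lobby, marble floor, warm lamplight",
    "negative_prompt": "naked, nude, topless, open mouth, teeth",
    "steps": 30,
    "width": 512,
    "height": 512,
    "cfg_scale": 7,
}

req = request.Request(
    "http://127.0.0.1:7860/sdapi/v1/txt2img",  # default local webui address
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with request.urlopen(req) as resp:
    result = json.load(resp)  # result["images"] is a list of base64-encoded PNGs
```

In the webui itself you'd just type the weighted trigger, e.g. (8hit8:1.2), straight into the prompt box and tune the number until the likeness holds without overpowering the rest of the prompt.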
Samples:
These sample generations were made with Galaxy Time Machine Photo for You and Deliberate. The rougher-looking ones used simple prompts; the better-looking ones required extensive prompt engineering. I didn't use Hires Fix, face restoration, ControlNet, img2img, LyCORIS, or negative embeddings on any of these generations. I believe I used model LoRAs on a few, but none for concepts or subjects. Your results will probably improve if you use any of those!
Note: The new religious image I uploaded uses a lot of tricks, and the demon picture does too. They're just meant to show what's possible with the embedding and some extreme prompt engineering.
PREVIEW
I've been working on Version 2 and had a breakthrough: it was leaps and bounds better than Version 1. But after playing around with it more, I found some inconsistencies in generation. When it worked, it worked much better, but it was far more difficult to prompt for; basically, unless you prompted for specific "features" of the subject, they wouldn't show up well.
I'm going to fix what I think the issue was (likely the captioning) and train again sometime soon. My hope is to produce an embedding that gets results as good as the one below with simpler prompts and more consistency.