Sniffing / smelling (own) armpit - Pony

Updated: Jun 30, 2024
Tags: concept, smelling, armpit
Verified: SafeTensor
Type: LoRA
Stats: 633 · 6,572
Published: Jun 30, 2024
Base Model: Pony
Training: 13,440 steps, 24 epochs
Usage Tips: Strength 0.9
Trigger Words: sniffing armpit
Hash: AutoV2 6AA286D7B0

This LoRA allows depicting a person smelling / sniffing their own armpit.

Surprisingly, Pony doesn't already know this concept (at least with the tags I tried), so I decided to create a LoRA for it.

Main trigger: sniffing armpit

Additional tags (ordered by tag frequency): exposed armpit, clothed armpit, arm lowered

(The last tag had so few training images that it is really hit-and-miss. When using clothed armpit, it is best to also put exposed armpit in the negative prompt, as the term armpit alone already makes Pony quite happy to generate ... well, armpits.)

Suggested LoRA weight: 0.4 – 1.0, depending on the style you want.
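To make the tag usage concrete, here is a minimal sketch of assembling a prompt with the trigger, the LoRA weight, and the suggested negative tag, using A1111/Forge-style `<lora:...>` syntax. The file name `sniffing_armpit_pony` is a placeholder, not the actual file name:

```python
# Sketch of prompt assembly for this LoRA (A1111-style <lora:name:weight> syntax).
# The LoRA file name is a placeholder; use whatever you saved the file as.

def build_prompt(lora_name: str, weight: float = 0.9, clothed: bool = False) -> tuple[str, str]:
    """Return (positive, negative) prompts using the tags listed on this page."""
    tags = ["sniffing armpit"]  # main trigger
    tags.append("clothed armpit" if clothed else "exposed armpit")
    positive = f"score_9, score_8_up, {', '.join(tags)} <lora:{lora_name}:{weight}>"
    # For clothed shots, push "exposed armpit" into the negative prompt,
    # since "armpit" alone nudges Pony toward exposed armpits.
    negative = "exposed armpit" if clothed else ""
    return positive, negative

pos, neg = build_prompt("sniffing_armpit_pony", weight=0.9, clothed=True)
```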

The explanation for Pony being so bad with this concept is probably that there are barely any images tagged accordingly on the various boorus.

And with that, we can talk a bit about the

Training

Specifically, I collected 36 samples from different boorus (no real cherry-picking; they were all the good ones I could find).

I then generated 170 additional images with Pony Diffusion using ControlNet (a mixture of Depth and Pose models). For that I randomized art styles, gender, ... .

Input images for the ControlNet were both drawings and real photographs (unbelievably many stock photos of this exist). They were sourced from a conventional image search, and 19 of them were also added to the training images directly.
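The randomized ControlNet generation step could look roughly like this. The style and subject lists below are invented placeholders (the actual variations used are not listed on this page), and the diffusion/ControlNet calls themselves are elided:

```python
import random

# Hypothetical variation lists -- the actual styles/subjects used are not documented here.
STYLES = ["oil painting", "anime screencap", "watercolor", "3d render", "sketch"]
SUBJECTS = ["1girl", "1boy"]

def generation_prompts(n: int, seed: int = 0) -> list[str]:
    """Build n randomized prompts to pair with ControlNet depth/pose inputs."""
    rng = random.Random(seed)  # seeded for reproducibility
    prompts = []
    for _ in range(n):
        style = rng.choice(STYLES)
        subject = rng.choice(SUBJECTS)
        prompts.append(f"{subject}, sniffing armpit, {style}")
    return prompts

batch = generation_prompts(170)  # 170 images were generated this way
```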

This resulted in 225 training images.

All of them were then tagged using wd-swinv2-tagger-v3 by SmilingWolf, and afterwards the 4 tags listed above were manually added.

Afterwards, I added masks to the images using first RemBG (Human) and then CLIPSeg with the text Arm, Armpit, Face. As the dataset was small and a quick glance showed that not all masks were correct, I also did a quick manual pass to correct them.
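The masking step amounts to combining several soft masks (a person mask such as RemBG's output, plus per-prompt masks such as CLIPSeg's outputs for Arm, Armpit, Face) into one binary training mask. The exact combination rule isn't stated above; one plausible reading is a pixel-wise union, sketched here with NumPy and the model outputs stubbed as small arrays:

```python
import numpy as np

def combine_masks(person_mask: np.ndarray, part_masks: list[np.ndarray],
                  threshold: float = 0.5) -> np.ndarray:
    """Union of a person mask (e.g. from RemBG) and per-prompt soft masks
    (e.g. CLIPSeg outputs for "Arm", "Armpit", "Face"), then binarized."""
    combined = person_mask.astype(np.float32)
    for m in part_masks:
        combined = np.maximum(combined, m.astype(np.float32))  # pixel-wise union
    return (combined >= threshold).astype(np.uint8)

# Stub 2x2 masks standing in for real model outputs:
person = np.array([[0.9, 0.1], [0.0, 0.0]])
armpit = np.array([[0.0, 0.8], [0.0, 0.0]])
mask = combine_masks(person, [armpit])  # union of both, thresholded at 0.5
```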

I then trained the LoRA using OneTrainer.

The relevant training parameters were:

  • Prodigy optimizer

  • 24 epochs @ 560 steps

  • 10 image repetitions (with image and caption variations)

  • Batch size 4

  • Using image masks, with unmasked probability of 0.03, unmasked weight 0.02

  • 1024 resolution with aspect bucketing

  • LoRA rank 48, alpha 2 (later resized to target 32, with sv_fro 0.99)
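The listed numbers can be cross-checked against each other: 225 images with 10 repeats at batch size 4 gives roughly the 560 steps per epoch quoted above, and 560 steps over 24 epochs reproduces the 13,440 total steps from the stats block. The small gap (562 vs. 560) is plausibly due to aspect bucketing dropping partial batches:

```python
# Cross-check of the step counts from the listed training parameters.
images, repeats, batch_size, epochs = 225, 10, 4, 24

# 225 * 10 / 4 = 562.5 -> 562 full batches; the page reports 560 steps/epoch,
# a small gap plausibly explained by aspect bucketing dropping partial batches.
steps_per_epoch = (images * repeats) // batch_size

# 560 steps/epoch over 24 epochs matches the "Steps: 13,440" stat above.
total_steps_reported = 560 * epochs
```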

Training took about 8 hours on an RTX 4090.

If you have any additional questions feel free to ask.