
Breaking New Grounds - Full Glass of Wine + Diverse Clocks

Updated: Mar 20, 2025
Tags: concept, wine, clock
Verified: SafeTensor
Type: LoRA
Published: Mar 20, 2025
Base Model: SDXL 1.0
Training Epochs: 12
Usage Tips: Clip Skip 1
Trigger Words:
  • image of a wristwatch
  • image of an analog wall clock
  • image of a full wine glass
  • image of a full glass of wine
Hash (AutoV2): 1A9D132A4C
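
As a rough usage sketch (not part of the original listing), the trigger words above could be prepended to a prompt and the LoRA loaded on top of SDXL 1.0 with the diffusers library. The directory `./loras`, the file name `full_wine_glass_clocks.safetensors`, the output path, and the step count are all placeholders, since the page does not state them; a CUDA GPU is assumed.

```python
# Trigger phrases copied from the listing.
TRIGGER_WORDS = (
    "image of a wristwatch",
    "image of an analog wall clock",
    "image of a full wine glass",
    "image of a full glass of wine",
)


def build_prompt(trigger: str, details: str = "") -> str:
    """Put the trigger phrase first, followed by optional extra detail."""
    return f"{trigger}, {details}" if details else trigger


def generate(prompt: str, lora_dir: str, weight_name: str, out_path: str) -> None:
    """Render one image with SDXL 1.0 plus the LoRA (needs a CUDA GPU)."""
    # Imported lazily so the prompt helper works without torch installed.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
    ).to("cuda")
    # lora_dir and weight_name are hypothetical; point them at the downloaded file.
    pipe.load_lora_weights(lora_dir, weight_name=weight_name)
    image = pipe(prompt, num_inference_steps=30).images[0]
    image.save(out_path)


if __name__ == "__main__":
    prompt = build_prompt(TRIGGER_WORDS[3], "on a wooden table, studio lighting")
    generate(prompt, "./loras", "full_wine_glass_clocks.safetensors", "wine.png")
```

Clip Skip 1 corresponds to the pipeline default, so nothing extra is set for it here.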
MR_BEEP

Why?

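This LoRA aims to solve the "strawberry" problem of ML image generation models: a task that looks trivial yet keeps tripping them up, in this case a wine glass filled to the brim and clocks showing arbitrary times.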

"Oh, your billion-dollar algorithm cannot solve the full wine glass problem!"

Not anymore.

Open Source is For Everyone

Open source is not bound by huge corporate workflows and processes. It took me 32 minutes to train this model after manually captioning 20 images.

Training Details:

Epochs: 12

Steps: 1920

Optimizer: --optimizer_type=adopt.ADOPT

LR: 8e-5

TE LR: 4e-5

Scheduler: constant_with_warmup 2% (important to warm the refrigerated wine)

Rank: 128/64

Debiased Est. Loss: True
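
The numbers above can be sanity-checked with a little arithmetic. The sketch below assumes that "LR" is the UNet learning rate, that the 2% warmup is a linear ramp from zero (the usual meaning of constant_with_warmup), and that the warmup step count is rounded down; the actual trainer may round differently. The 20-image dataset size comes from the text above, and batch size 1 is an assumption.

```python
TOTAL_STEPS = 1920
EPOCHS = 12
NUM_IMAGES = 20         # manually captioned images, per the description
WARMUP_FRACTION = 0.02  # "constant_with_warmup 2%"
UNET_LR = 8e-5          # assumed to be what "LR" refers to
TE_LR = 4e-5            # text encoder learning rate

steps_per_epoch = TOTAL_STEPS // EPOCHS            # 160
# At batch size 1 (an assumption), each image is seen this often per epoch.
passes_per_image = steps_per_epoch / NUM_IMAGES    # 8.0
warmup_steps = int(TOTAL_STEPS * WARMUP_FRACTION)  # 38 (rounded down)


def lr_at(step: int, base_lr: float, warmup: int = warmup_steps) -> float:
    """Linear ramp from 0 to base_lr over `warmup` steps, then constant."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr


if __name__ == "__main__":
    for step in (0, 19, 38, 1919):
        print(step, lr_at(step, UNET_LR), lr_at(step, TE_LR))
```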

No flipping or caption shuffling because flipping would not work well with clocks (duh).
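
To illustrate why flipping hurts here, this small sketch (not from the training setup) computes the time a horizontally mirrored clock face appears to show. Mirroring negates both hand angles, so the apparent time is twelve hours minus the real one, and the caption would no longer match the image.

```python
def mirrored_time(hour: int, minute: int) -> tuple[int, int]:
    """Time a clock appears to show after a horizontal flip."""
    total = (hour % 12) * 60 + minute   # minutes past 12:00
    flipped = (720 - total) % 720       # mirror about the 12-6 axis
    h, m = divmod(flipped, 60)
    return (12 if h == 0 else h, m)


if __name__ == "__main__":
    for h, m in ((3, 0), (10, 10), (6, 0)):
        fh, fm = mirrored_time(h, m)
        print(f"{h}:{m:02d} flips to {fh}:{fm:02d}")
```

Only 12:00 and 6:00 survive the flip unchanged, and even then the numerals on the dial would be reversed.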

Problems Encountered During Preparation & Training:

  • It's slightly harder to find pictures of full wine glasses and of clocks showing different times (duh). Thanks to Reddit for the clocks and Instagram for the full wine glasses. Some people are animals.

  • Turns out I forgot how to read analog clocks. Some images might have been captioned wrong.

  • The wine and clock biases are harder to fully correct than I expected.

  • Out-of-distribution (OOD) generations still underperform. More examples and steps might be needed.
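
For the clock-reading problem mentioned above, one possible aid when double-checking captions is to compute where the hands should point for a captioned time and compare that against the image by eye. This is a hypothetical helper, not something used for this model; angles are in degrees, measured clockwise from 12 o'clock.

```python
def hand_angles(hour: int, minute: int) -> tuple[float, float]:
    """Return (hour_hand, minute_hand) angles in degrees, clockwise from 12."""
    minute_angle = minute * 6.0                     # 360 degrees / 60 minutes
    hour_angle = (hour % 12) * 30.0 + minute * 0.5  # 30 degrees per hour plus drift
    return hour_angle, minute_angle


if __name__ == "__main__":
    for h, m in ((3, 0), (10, 10), (7, 45)):
        print(f"{h}:{m:02d}", hand_angles(h, m))
```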