Published | Nov 24, 2023 |
Base Model | SD 1.5 |
Training | Steps: 22,702 Epochs: 10 |
Usage Tips | Clip Skip: 1 |
Hash | AutoV2 C7795728AE |
USAGE GUIDE:
Negative Prompt (Highly Recommended):
((cartoon, painting, illustration, anime, worst quality, low quality, normal quality))
Sampling method: DPM++ 3M SDE Karras (untested with other samplers)
Width and height: anything in the range 512 ~ 768
Sampling steps: 40 minimum
CFG scale: 4
Hires fix:
Upscaler: 4x-UltraSharp
Upscale: 2 minimum
Denoise strength: 0.35 ~ 0.45
Clip skip: 1
Suggested VAE if necessary: vae-ft-mse-840000-ema-pruned
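For anyone scripting generations, the settings above can be collected into a single request body. This is only a minimal sketch: the field names assume the AUTOMATIC1111 web UI txt2img API and may differ between versions (newer releases may split the sampler and scheduler names), and `build_payload` and its parameters are hypothetical helpers, not part of this model.

```python
# Sketch only: field names assume the AUTOMATIC1111 web UI txt2img API.
# build_payload is a hypothetical helper; check the names against your own setup.

NEGATIVE_PROMPT = (
    "((cartoon, painting, illustration, anime, "
    "worst quality, low quality, normal quality))"
)


def build_payload(prompt, width=512, height=768, steps=40,
                  hr_scale=2.0, denoise=0.4, vae=None):
    """Return a txt2img request body using the recommended settings."""
    # Enforce the ranges given in the usage guide.
    if not (512 <= width <= 768 and 512 <= height <= 768):
        raise ValueError("width and height should be in the range 512 ~ 768")
    if steps < 40:
        raise ValueError("sampling steps: 40 minimum")
    if hr_scale < 2:
        raise ValueError("hires fix upscale: 2 minimum")
    if not (0.35 <= denoise <= 0.45):
        raise ValueError("denoise strength should be 0.35 ~ 0.45")

    overrides = {"CLIP_stop_at_last_layers": 1}  # Clip skip: 1
    if vae is not None:
        # Assumption: pass the VAE file name exactly as it appears locally.
        overrides["sd_vae"] = vae

    return {
        "prompt": prompt,
        "negative_prompt": NEGATIVE_PROMPT,
        "sampler_name": "DPM++ 3M SDE Karras",
        "steps": steps,
        "cfg_scale": 4,
        "width": width,
        "height": height,
        "enable_hr": True,               # Hires fix
        "hr_upscaler": "4x-UltraSharp",
        "hr_scale": hr_scale,
        "denoising_strength": denoise,
        "override_settings": overrides,
    }


if __name__ == "__main__":
    payload = build_payload("1girl, solo, long hair, brown hair, crop top")
    for key, value in payload.items():
        print(f"{key}: {value}")
```

The range checks are there so a script fails early instead of silently generating outside the recommended settings.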
DISCLAIMER:
This is my first trained model, and I would love to see what the community makes with it. It's not the best model yet; I'm uploading it to get feedback, so don't expect too much. Hands and feet are pretty bad at the moment. This version focuses mainly on realistic detail and backgrounds, not yet on the subject. It's biased toward generating NSFW subjects; I'll improve that in the future.
I'm still testing prompts. It's hard to get good output without proper prompting, and the model has been fed only a limited dataset. I'll share some examples in the reviews for public reference.
This is a trained checkpoint, unlike the vibrant model, which is a merge of a model with LoRAs. I noticed that the merge started to generate bad output after 6-8 LoRAs had been merged into it, so I decided to train the dataset directly into the model instead of merging LoRAs. I'm using Stable Diffusion 1.5 as the starting point, which is why the quality is pretty bad. It was trained with roughly 3,000 images as the dataset and 5,000 images as regularization. I'll add better datasets from time to time based on community feedback. I'm aiming for roughly 10k regularization images to reduce NSFW output in v2.
CHANGELOG:
Beta v4 (Unreleased)
Trying out a different training method (max step training)
Based on the fine-tuned v3 model
Prompt testing:
v4 3000 steps:
Simple prompt
Positive: 1girl, solo, long hair, brown hair, crop top,
Negative: ((cartoon, painting, illustration, anime, worst quality, low quality, normal quality))
Sampling method: DPM++ 3M SDE Karras
Sampling steps: 30
CFG scale: 5
Result:
Complex prompt
Positive: (photorealistic), (masterpiece), (photography), (realistic skin texture), (professional lighting), alluring woman 90s style, 90s Aesthetics, emo 1girl, black emo hair, long hair with bang, checkerboard flared pants, low waist wide bell pants, closed mouth, stripped red and black shirt, holding, jewelry, long hair, midriff, (navel black piercing), solo, red checkerboard belt, eyeliner, chains, emo style, fishnet top, spikes,
Negative: ((cartoon, painting, illustration, anime, worst quality, low quality, normal quality))
Sampling method: DPM++ 3M SDE Karras
Sampling steps: 30
CFG scale: 5
Result:
Beta v3
Major improvement on hands and feet, with fewer anatomical errors
Added more datasets and fine-tuned further
Added fire and magic particle datasets
(Estimated 3-5 generations to get 1 good output)
Faces might need some rework
Complex prompts still cause issues
Beta v2
Toned down NSFW output
Added new dataset and regularization images
Slight improvement on hands, feet and overall anatomy
(Estimated 10 generations to get 1 good output)
Beta v1 (Removed)
Achieved realistic output
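A rough way to read the per-version estimates above (about 10 generations per good output for v2, 3-5 for v3) is as a success rate per image. The sketch below assumes each generation is independent with the same chance of being good, which is a simplification and not something measured; `chance_of_good_output` is a hypothetical helper for planning batch sizes.

```python
# Sketch only: treats "N generations to get 1 good output" as a per-image
# success probability of 1/N and assumes generations are independent.

def chance_of_good_output(generations_per_good, batch_size):
    """Probability that a batch contains at least one good image."""
    p = 1.0 / generations_per_good
    return 1.0 - (1.0 - p) ** batch_size


if __name__ == "__main__":
    estimates = [("v2", 10), ("v3 (worst case)", 5), ("v3 (best case)", 3)]
    for label, per_good in estimates:
        chance = chance_of_good_output(per_good, batch_size=4)
        print(f"{label}: {chance:.0%} chance of a good image in a batch of 4")
```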