Published | Apr 20, 2025 |
Base Model | AstolfoMix-XL |
Training | Steps: 400,000 |
Usage Tips | Clip Skip: 2 |
Hash | AutoV2 E33852C4AE |
AstolfoXL (WIP)
Probably the first (and only) individual full finetune done with multi-GPU.
LoKr works, but no thanks. I reject my humanity, Sanae!
Discord: "Good luck".
Specification
Base model: AstolfoMix-XL, version 255c
Tech report: ch06
Training metrics (tensorboard): HF
Dataset (images > latents): danbooru2024, e621_2024
Dataset (tags + captions): meta_lat.json
1 step = 16 images, on 4x RTX 3090 24 GB.
779k steps for 1 epoch (EP): 8.0M + 4.6M = 12.6M images (see the sketch after this list).
Tags + NLP captions, combined with the A1111 token trick
Trainer code: the PR won't be merged
Training parameters: AdamW8bit, UNet LR 1.5e-6, TE LR 1.2e-5, batch size 4 per GPU (4 GPUs), gradient accumulation 4, 71% of the UNet trained (for speed, and it must underfit)
75-100+ days for 1 epoch. Train 1 epoch only. Checkpoints saved every 10k steps.
Core concept: Unsupervised learning
Expectation: mid-tier quality (100% of the data: no filtering, no quality tags)
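As a sanity check, the quoted step counts are self-consistent. A minimal sketch in Python; reading "BS4 (4 GPUs)" as batch size 4 per GPU is my assumption:

```python
# Back-of-the-envelope check of the epoch length quoted above.
images_per_step = 4 * 4                    # BS4 x 4 GPUs = 16 images per step
dataset_size = 8.0e6 + 4.6e6               # danbooru2024 + e621_2024, ~12.6M images
steps_per_epoch = dataset_size / images_per_step
print(f"{steps_per_epoch:,.0f} steps per epoch")  # 787,500, close to the quoted 779k
```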
How to use
Train LoRAs on, or merge on top of, this model (a minimal merge sketch follows below). Compatibility should still be close to 215c. Realistic human content is still supported. "Trust me bro".
Artist tags may not work, but they were trained. Just dump your "NAI" prompts here.
Use TIPO to expand tag-based prompts with NLP.
Short tags will suffer from background latent noise. Valid tags can be found on e621 or Danbooru.
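For the "merge on top" route, a hedged sketch of the simplest approach: a plain weighted average of two SDXL checkpoints in safetensors format. The filenames and the 0.5 ratio are illustrative assumptions, not the author's recipe:

```python
import torch
from safetensors.torch import load_file, save_file

base = load_file("astolfoXL.safetensors")    # this model (hypothetical filename)
other = load_file("your_model.safetensors")  # the model merged on top (hypothetical)

alpha = 0.5  # weight of the other model; an illustrative value
merged = {}
for key, tensor in base.items():
    if key in other and other[key].shape == tensor.shape:
        # Linear interpolation in float32, cast back to the original dtype.
        merged[key] = torch.lerp(tensor.float(), other[key].float(), alpha).to(tensor.dtype)
    else:
        merged[key] = tensor  # keep the base weight where shapes differ or keys are missing

save_file(merged, "merged.safetensors")
```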
Any observation made before 1 full epoch should not be used as justification: every image has been seen only once.
Full documentation will be published; it will be as long as the AstolfoMix series'.
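Until the full documentation lands, a minimal inference sketch with diffusers. The checkpoint filename and the prompt are assumptions; clip_skip=2 follows the Usage Tips field above and is supported as a pipeline call argument in recent diffusers versions:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_single_file(
    "astolfoXL.safetensors",  # hypothetical local filename
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="1boy, astolfo (fate), pink hair, long braid, smile",  # Danbooru-style tags
    negative_prompt="lowres, bad anatomy",
    clip_skip=2,              # per the Usage Tips field above
    num_inference_steps=28,
).images[0]
image.save("astolfo.png")
```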