Stats | 1,768 |
Reviews | (300) |
Published | May 11, 2023 |
Training | Steps: 36,000 Epochs: 50 |
Hash | AutoV2 50C97B71F0 |
Hey, I would appreciate it if you left a review <3
V4.2 Update! Compatibility with most anime (and even some non-anime) models (NAI-trained)! Dataset expanded to 720 manually tagged images! Better style control!
I worked hard on it, and here it is! Overall it's improved everywhere except NSFW; the dataset is still 100% SFW. Also, no more latex-y clothes everywhere, that's been fixed. Hats shouldn't appear nearly as much, though I didn't have that issue personally. General quality is quite a bit higher, since training was completely switched to Adan with parameters optimized specifically for the ranks I'm using. Features like separate weight decay for the unet and tenc were used, and tenc weight clipping too.
v4 features an expanded dataset, which led this LoRA to generalize better. It now follows the composition of the original model better, which means you're not losing much while using it. Stability at weight 1 improved too.
Well, don't just take my word for it, since it might not hold for your model xD I have not released the mix I'm training on yet, and I don't think that'll happen in the near future, so please test it yourself. v4 should generalize styles better, so if it worked in v3, it should work better in v4.
Hey, it's my first time actually posting something that I trained. This is not my first attempt or anything, though; I've been doing this for quite a while and just finally decided to share.
I'm not the most experienced person in this, and I can't experiment more than I already do, since my hardware doesn't even let me do Dreambooth.
But you can be certain that I really worked hard to get to what I train today.
The LoRA you'll see and read about today is not my only project. I have multiple others, including NSFW ones, though I'm not sure I'll ever release those. It is likely that I will release more SFW ones in the future, so I would really love to hear your feedback on this one.
You can contact me about this LoRA through Discord, Anzhc#5269. Please leave a message, since I'm not going to add someone as a friend without knowing who they are.
If you'd like to support my endeavours and experiments with money, there is a Patreon link at the bottom of the text. I'd feel bad posting it at the top, just blatantly asking for money before you've even read about my stuff xD
If you're interested in the details of my training process, you can join the D8ahazard Dreambooth Discord channel. I'm active there and share my findings and parameters. At the bottom of this text I give a general description of what I was doing.
Well, on to the actual description down there!
This is a personal style training, or an averaged representation of an amalgamation of style concepts, if you will.
It's nothing special (wrong, now it's special, at least to me :D -- v4 update), since I can't do much on my GPU, but it works and I personally love it.
---Couple comparisons before we start wall of text---
P.S. v4 1.0 is released, so please look at its images for a better representation of what the latest version can do. Also, no more latex clothes everywhere.
---Text wall starts here---
S12 does not require trigger words, as it was trained as a fine-tune, but ohwx is there if you want to make it more powerful (mostly not needed).
The main feature of this LoRA is abstraction and the abstract power of subjects, which I'm very bad at describing, but I think you'll understand what I mean from the example. Base models are usually already capable of understanding that concept, but my goal was to enhance it and make it more tangible.
The base style is anime, but with a large share of 3D-like imagery.
Prompt for this: masterpiece, best quality, 1girl, upper body, powerful, abstract <lora:S12 ohwx Adan_18000:0.75>
negative:(worst quality, low quality:1.4)
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3914799726, Size: 512x768, Denoising strength: 0.7, Clip skip: 2, ENSD: 4537, Hires upscale: 1.7, Hires upscaler: Latent (nearest-exact)
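If you want to reuse these settings in a script, here is a minimal Python sketch that pulls the LoRA name and weight out of the prompt and splits the settings line into key/value pairs. The function names `extract_loras` and `parse_settings` are made up for illustration and are not part of any UI or library; the sketch only assumes the `<lora:name:weight>` tag form and the "Key: value, Key: value" layout shown above.

```python
import re

# Matches tags of the form <lora:NAME:WEIGHT>, as used in the prompt above.
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def extract_loras(prompt):
    """Return (prompt without LoRA tags, [(name, weight), ...])."""
    loras = [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]
    cleaned = LORA_TAG.sub("", prompt).strip()
    return cleaned, loras


def parse_settings(line):
    """Split a 'Key: value, Key: value' settings line into a dict of strings."""
    settings = {}
    for part in line.split(", "):
        key, _, value = part.partition(": ")
        settings[key] = value
    return settings


prompt = ("masterpiece, best quality, 1girl, upper body, powerful, abstract "
          "<lora:S12 ohwx Adan_18000:0.75>")
text, loras = extract_loras(prompt)
settings = parse_settings("Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Clip skip: 2")
print(loras, settings["Sampler"])
```

All values stay strings in `parse_settings`, so convert them (for example `int(settings["Steps"])`) where you need numbers.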
It is very simple to use. The preferable weight for the S12 LoRA is ~0.5-0.75, since 1 gives too much of an effect and makes the image too dark for my liking, as it was trained with offset noise. In combination with other LoRAs it can even be lowered to 0.25. (It does combine with some of them quite easily, without introducing visible errors, but I suspect that won't be the case if the other LoRAs were trained with offset noise too, since that will stack the effect.)
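To illustrate why the weight matters and why effects stack, here is a rough sketch of how LoRA weights are commonly described as being applied: each LoRA contributes a low-rank update (up matrix times down matrix) that is multiplied by its weight and added to the base layer. This is a simplified assumption, not the code of any specific UI; real implementations also scale by alpha/rank, which is left out here, and `apply_loras` and the tiny matrices are made up for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_loras(base, loras):
    """Add weight * (up @ down) to a copy of base for every (weight, up, down)."""
    merged = [row[:] for row in base]
    for weight, up, down in loras:
        delta = matmul(up, down)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += weight * value
    return merged


# Toy 2x2 layer with two rank-1 LoRAs (all numbers are illustrative only).
base = [[1.0, 0.0], [0.0, 1.0]]
style = (0.25, [[1.0], [2.0]], [[0.5, 0.0]])   # stand-in for S12 at 0.25
other = (0.75, [[0.0], [1.0]], [[0.0, 1.0]])   # stand-in for a second LoRA
print(apply_loras(base, [style, other]))
```

Because the updates are simply summed, two LoRAs that push in the same direction (for example both trained with offset noise) add up, which is one reason to lower the weights when combining them.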
Note: the image below was made using S12, but it is not the only LoRA used in generation and upscale.
(I don't remember the prompt, and it was lost during upscaling, sorry. I was deleting thousands of images cuz no space ;-;)
This image uses a LoRA that I called "Effect" (I know, I'm so f'ing original); it is meant to enhance composition a bit. S12 here is at 0.25.
The dataset is fairly small, but enough for style purposes: 360 images as of v3.3.2 (it will be expanded to ~580 for v4.0, and further in later iterations). As of version 3.3.2 they were generated in Midjourney v4 (one specific sub-style of theirs, I don't remember which, probably 4b) with a special style prompt, which has the very original name "style12".
I wanted to separate out more nice knowledge, like "split theme" for example, since the models I use don't know it well, but the dataset representation was not enough to overwrite the existing token knowledge, so it works quite poorly and mostly at high weight, if at all.
---Capabilities---
Generally anything, since it's a style LoRA. Plus the special concepts I already described, plus an expansion of them in the future.
NSFW?
Yes, but it was not trained on NSFW imagery. The dataset is 100% safe for work, so don't expect anything new in that regard and rely on your model's knowledge. It will apply the style to NSFW, and I quite like it sometimes.
---Example Images---
Since I'm using a custom deep mix that I never posted on the internet, I will include some images from one of its base models, ACertainModel, whose results carry over to generally any other anime model, like Anything3. You'll be able to tell those generations apart by their more 2D-ish look. If you can't, consider it me doing a good job, and this LoRA is applicable to most anime models without much concern.
---Training quirks and techniques---
The info below is valid for versions below v4 0.5.
The dataset consists of 360 images generated in MJ, with a special style prompt that does not represent a specific style. You can expect the overall look to be quite MJ-ish, but... not that MJ-ish, I guess.
It is trained with the Adan D-Adaptation optimizer, which is fairly new and experimental, so expect quite a few rough spots.
The training type is fine-tune, so it will affect the whole model for the most part.
Some images were tagged with custom quality tags, different from standard quality tags like "masterpiece" and so on, so the standard ones are not affected. The custom tags are not represented enough to matter in v3.3.2, so they will not be disclosed.
Techniques like offset noise were used, so it also improves contrast. Because of that, the preferred LoRA weight is 0.75, as it is too strong at 1.0. Noise offset was probably ~0.12-0.15.
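For readers who haven't met offset noise: it is commonly described as adding one random shift per channel on top of the usual per-pixel Gaussian training noise, which is usually given as the reason such models move overall brightness and contrast more freely. Below is a minimal pure-Python sketch of that common formulation. It is an assumption about the general technique, not the exact code of the trainer used here, and `offset_noise` is a made-up helper name.

```python
import random


def offset_noise(channels, height, width, offset, rng):
    """Per-pixel Gaussian noise plus one shared random shift per channel."""
    noise = []
    for _ in range(channels):
        # One value for the whole channel, scaled by the noise offset setting.
        shift = offset * rng.gauss(0.0, 1.0)
        noise.append([[rng.gauss(0.0, 1.0) + shift for _ in range(width)]
                      for _ in range(height)])
    return noise


# 0.15 is the upper end of the range mentioned above; the 4x8x8 shape is arbitrary.
sample = offset_noise(channels=4, height=8, width=8, offset=0.15, rng=random.Random(0))
channel_means = [sum(map(sum, ch)) / 64 for ch in sample]
print(channel_means)
```

With `offset=0.0` this reduces to plain Gaussian noise; a larger offset lets the mean of each channel drift further.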
As it was trained with D-Adaptation, LR is not very relevant, and it's also not available in main/dev D8 at the time of writing, so I don't know (rip).
For LoRA ranks I prefer 96/384, but for Adan training I had to lower them, since D-Adapt optimizers are heavy on VRAM. I think they were in the range of 48-64/256.
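As a rough illustration of why lowering the rank saves memory: a rank-r LoRA on a layer with `d_in` inputs and `d_out` outputs trains two matrices of shapes r x d_in and d_out x r, so the trainable parameter count grows linearly with the rank. The sketch below only does that arithmetic. The 768 x 768 layer size is a hypothetical example, not a measurement from this training run, and actual VRAM use also depends on optimizer state, activations, and batch size.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters of one LoRA pair: down (rank x d_in) + up (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical 768 x 768 layer, compared at ranks mentioned in the text.
for rank in (96, 64, 48):
    print(rank, lora_param_count(768, 768, rank))
```

Halving the rank halves the trainable parameters per layer, and any per-parameter optimizer state shrinks with it.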
The scheduler used was DEIS, I'm pretty sure.
Since there were bugs that made training worse if you generated sample images in the middle of training (should be fixed already), I will also note that it was a one-shot run, with no sample generation or checkpoint saving in the process.
Note: It is trained on a custom deep mix, which is a combo of ACM+AR in step 1 and around 1/3 of AOM2 in step 2, with selective high usage of certain blocks, which turned out to work awesome, and in lots of cases better for my usage. It will work with any other anime model because of the similarities, but it works best with the ones described above, specifically AOM2, since the style is closer to it.
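For anyone unfamiliar with this kind of mix, here is a minimal sketch of a two-step weighted merge with per-block overrides, using plain floats as stand-ins for weight tensors. It illustrates the general technique only. The 0.5 ratio in step 1, the block prefixes, and the override value are placeholders, since the actual ratios and blocks of this mix are not given, and `weighted_merge` is a made-up helper.

```python
def weighted_merge(model_a, model_b, alpha, block_alphas=None):
    """Return (1 - a) * A + a * B per key; a is alpha unless a block prefix overrides it."""
    block_alphas = block_alphas or {}
    merged = {}
    for key, a_value in model_a.items():
        a = alpha
        for prefix, override in block_alphas.items():
            if key.startswith(prefix):
                a = override
        merged[key] = (1.0 - a) * a_value + a * model_b[key]
    return merged


# Toy "checkpoints": one float per block instead of real tensors.
acm = {"input.0": 1.0, "middle.0": 1.0, "output.0": 1.0}
ar = {"input.0": 3.0, "middle.0": 3.0, "output.0": 3.0}
aom2 = {"input.0": 9.0, "middle.0": 9.0, "output.0": 9.0}

step1 = weighted_merge(acm, ar, alpha=0.5)  # placeholder ratio for ACM+AR
step2 = weighted_merge(step1, aom2, alpha=1 / 3,
                       block_alphas={"output.": 0.8})  # hypothetical high-usage block
print(step2)
```

The `block_alphas` argument is what "selective high usage of certain blocks" looks like in this toy form: most blocks take about a third of the second model, while selected blocks take much more.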
Images at this point are almost completely manually tagged (pain...), so the LoRA should be quite responsive to a wide range of tags, not bleed from concept to concept, and be highly editable.
As I said at the start, here's the link to my Patreon: https://www.patreon.com/anzhc
It is not centered around AI, but I don't mind sharing my experiments there in a WIP state, if that would ever interest patrons.
Just for genuine support, if you want.