zImageBase - V1.0
After quite a few tests with different settings, including theories without any proof, I decided to just change all training settings until something worked. Did a small test with only 6 images in different styles, photo, illustration etc. After 500 steps it pretty much nailed all images: low style bleeding if any, no flip-flopping of characters changing position, reduced hallucinations, learns faster ( well, it's slow but constant ) and the LoRA has more influence even at lower strength ( in some cases you had to crank the strength over 1.0, which induces artifacts )
This one is just another test with 20 images. Old images with a lot of junk in them, which was intentional, to see if it tries to compensate for mistakes or learns what it is given.
This is a version with a lower step count. The versions with higher steps were a bit too close to the source images and made the output very rigid, even with different seeds. Even this one is kinda strong.
I don't trust it fully yet. Currently training stuff with a bigger dataset which takes some time.
AI-Toolkit - Prodigy ( non-8bit, at least I think, if I read the optimizer files correctly )
Rank 64 ( haven't tested lower ranks yet, just went with it )
LR: 1 ( Prodigy changes the learning rate dynamically )
noise_scheduler: "flowmatch"
optimizer: "prodigy"
timestep_type: "shift"
content_or_style: "balanced"
optimizer_params:
weight_decay: 0.01 ( still needs testing if needed at all, changed it a few times )
d_coef: 1.0 ( also needs testing, changed it to 1.5 for the bigger dataset for now )
decouple: true
use_bias_correction: false
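Put together as an AI-Toolkit config section, the settings above would look roughly like this. This is a sketch from memory, not a copy of my actual file: the exact field names and nesting ( especially `network.linear` for the rank ) may differ in your AI-Toolkit version, so double-check against the example configs that ship with it.

```yaml
# Hedged sketch of the relevant AI-Toolkit training settings.
# Only the values mentioned above; everything else at defaults.
network:
  type: "lora"
  linear: 64            # rank 64
  linear_alpha: 64      # assumption: alpha == rank
train:
  lr: 1.0               # Prodigy adapts the effective LR on its own
  noise_scheduler: "flowmatch"
  timestep_type: "shift"
  content_or_style: "balanced"
  optimizer: "prodigy"
  optimizer_params:
    weight_decay: 0.01          # still needs testing
    d_coef: 1.0                 # 1.5 for the bigger dataset run
    decouple: true
    use_bias_correction: false
```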
Take all of this with a big grain of salt ... actually, buy a whole salt mine. With AI it seems no one knows how and why shit works or not. Even the people who made the models don't know what the fuck is going on; it's all just theory and math. High chance that I'm on the wrong path too, wouldn't be the first time. It did what it's supposed to do for now, so I'll stick with it till problems occur.
V.2.0
Added more pictures and higher step count.
Recommended to use a low strength or you turn everything into a tentacle monster for whatever reason.
Really wish they would put out the Pro version of Flux, since this distilled stuff is kinda hard to control and also limited in its flexibility. If you play around with some prompts, it always drifts toward a certain image or something it knows well and is trained on, which results in the typical sterile Flux look ( cinematic, photo, certain animals or people etc )
I also highly doubt that any of the fully trained checkpoints out there will ever work. Went through pretty much every single one of them, but it always comes down to fucked-up anatomy or a hardcore bias which you can't negate with negative prompts or prompt weighting, since Flux doesn't use this stuff. A picture of a person that is always naked even if you ask for clothes, for example. Doubtful that it even did/does anything in SDXL; never really used negative prompts much, but they were good enough to get rid of some things you tried to avoid.
Strangely enough, all checkpoints that are trained or LoRAfied with a style ( like anime ) work great, even for realistic images, which is pretty much what I use for all of my pictures ( not here, basic FP8 checkpoint for showcase reasons ) ... so confusing.
Trained on a dataset which I planned to use for SDXL, but I never got satisfying results. Just a small test with a few images ( basic captions for now ) and only 800 steps. Will change that to natural language later.
Most used words should be atmospheric, moody, calm, soothing, serene, mysterious ... and assorted ( the images without captions ... well, that is a caption too; forgot that Kohya picks up the folder name if no .txt file is present :D )
Order: first 2 images with LoRA / without LoRA; after that, reversed order
Has more/less impact in specific cases ( for now )
Did quite some testing with Flux LoRAs I've made and got really crazy results. Even after only like 100 - 200 steps it got the concept, and lowering/raising the strength from the base 1.0 always has a huge impact, but something is always left over, and it feels like you can kinda pick specific parts from the images it was trained on without it using the whole image ( like you only want the yellow clothing but nothing else )
Made like 30 LoRAs so far ( only for testing purposes ) and what you can do with just a few images is baffling.
Might be just a fluke, who knows.
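For reference, the strength slider is just a scalar on the low-rank update, which is why dialing it down keeps part of the concept instead of switching it off. A minimal numpy sketch of the standard LoRA math ( nothing Flux-specific; the names and the alpha == rank default are my assumptions ):

```python
import numpy as np

# Toy weight and low-rank factors; real ones come from the base model
# and the trained LoRA file.
rng = np.random.default_rng(0)
out_dim, in_dim, rank = 16, 16, 4
W = rng.normal(size=(out_dim, in_dim))   # base weight
A = rng.normal(size=(rank, in_dim))      # LoRA down-projection
B = rng.normal(size=(out_dim, rank))     # LoRA up-projection
alpha = rank                             # common default: scale of 1.0

def apply_lora(W, A, B, alpha, strength):
    # strength scales the whole learned delta linearly:
    # 0.0 -> base model untouched, 1.0 -> full trained effect
    return W + strength * (alpha / rank) * (B @ A)

W_low = apply_lora(W, A, B, alpha, 0.6)   # weaker influence
W_high = apply_lora(W, A, B, alpha, 1.2)  # cranked past 1.0
```

Because the delta is linear in strength, going over 1.0 extrapolates past what was trained, which fits the artifacts ( and tentacle monsters ) you get at high values.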


