Published | Nov 7, 2024 |
Training | Steps: 2,924 · Epochs: 34 |
Usage Tips | Strength: 1 |
Trigger Words | Aric |
Hash | AutoV2 125B1B3809 |
Aric is a ChaosMen adult performer. The site describes him: "Aric is literally a lumberjack. He has a tough living, cutting down trees and chopping wood. Hence the beat-up shins and very dark tanned face and lower arms. But he is a full-on Ginger [...]"
Please be responsible: this is based on the resemblance of a real person, and even if he was an "adult performer", you NEED to follow Civitai rules when posting. But please do test it! I'll be glad to see some results.
Click each model version to compare the preview images!
Oh wow, Aric got me into so much work...
First, he was my first attempt at full-finetuning a model, in this case with a character concept. I will publish this full model here just for reference. It works especially well for bleed control, but it was not that great for resemblance and terrible for LoRA extraction. For a good LoRA, it would need rank 640, and that is a 6GB LoRA... it would make no sense. So I went back to direct LoRA training (with some more experimentation that I'll go over below).
My second objective was tattoos. He has 6 different tattoos on his body, and I jumped at this opportunity to see how well I could teach the model a character's tattoos accurately, since there are some other people with a bunch of tattoos I really want to do in the future, and getting perfect resemblance without those tattoos would be a disaster... Anyway, back to Aric: the tattoos were kind of a failure...
This was all a lot of work. I'll try to explain my whole process here, as much for my own understanding and record as anything. So read it if you want.
Initial Dataset preparation (A):
I decided I didn't want to meddle too much with video screencaps, so I only used his two photoshoots. I now think that was a mistake: there are good frames from the videos with high quality and good close-ups of his face and tattoos that I'm pretty sure could have been great for learning, especially after block analysis or direct block-targeted training (to compensate for the jpeg artifacts that come with screencaps).
So first I had to remove the "chaosmen" watermark from every image so it would not interfere with learning. I used a very effective SD1.5 inpainting workflow for that, which I might publish as well in the future.
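The full workflow is for another post, but the core idea is standard inpainting: mask the watermark area and let the model fill it in. A minimal diffusers sketch of the idea (the checkpoint and paths are placeholders, not my exact setup; any SD1.5-class inpainting model works):

    import torch
    from PIL import Image
    from diffusers import StableDiffusionInpaintPipeline

    # any SD1.5-class inpainting checkpoint will do here
    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")

    image = Image.open("aric_001.jpg").convert("RGB").resize((512, 512))
    mask = Image.open("aric_001_mask.png").convert("L")  # white = area to repaint

    # neutral prompt so the fill blends in instead of inventing new content
    clean = pipe(prompt="skin, plain background", image=image, mask_image=mask).images[0]
    clean.save("aric_001_clean.jpg")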
I captioned all his images with JoyCaption v2 and referenced Noah (the other male in his non-solo photoshoot) as "Bob", just to have a name different enough that the LoRA could learn in simpler terms. I should have just gone with Noah, but I forgot to replace it. Anyway...
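(The name swap itself is just a find-and-replace over the caption files; something like this, with the folder path being a placeholder:)

    from pathlib import Path

    # rewrite every caption .txt, swapping the name token
    for txt in Path("dataset/aric").glob("*.txt"):
        txt.write_text(txt.read_text(encoding="utf-8").replace("Noah", "Bob"), encoding="utf-8")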
For the tattoos, I first extracted them all using Photoshop as best I could and applied a bunch of corrections for color, levels, light, distortion, etc. They were REALLY low res, like 128x128 or less, almost pixelated, so I then used Flux to upscale them. This was insane work and I was bummed it did not pay off. Here are some examples that went into the dataset:
I also tried putting a circle zoom of the tattoo on the image to see how well that could teach the model... It did learn the circle idea, but not the tattoos.
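If you want to try the same kind of Flux upscale, the simple version is a naive resize followed by low-denoise img2img. A diffusers sketch (strength/steps are starting-point guesses, not my exact settings):

    import torch
    from PIL import Image
    from diffusers import FluxImg2ImgPipeline

    pipe = FluxImg2ImgPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")

    tattoo = Image.open("tattoo_crop_128.png").convert("RGB")
    tattoo = tattoo.resize((1024, 1024), Image.LANCZOS)  # naive upscale first

    out = pipe(
        prompt="sharp black-ink tattoo design on skin, detailed photo",
        image=tattoo,
        strength=0.35,            # low denoise: keep the design, add detail
        num_inference_steps=28,
    ).images[0]
    out.save("tattoo_crop_1024.png")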
So initially I used all of this:
1 folder with all Aric photos + Bob interactions
1 folder with tattoos extracted on white background + tattoos inserted into the pictures with arrows and a circle zoom + some close-ups of the tattoos on his own body.
Later dataset adjustments (B):
After my experiments described below, I went back to the dataset, removed repetitions, and removed OR cropped ALL Bob images where Bob's (Noah's) face was clearly visible.
I also dropped the tattoo folder completely. I wanted this dataset focused on face resemblance.
THIS was a big game changer for my LoRA. So if anyone is reading this, my best advice is: keep the focus on your character. Don't try to include another person just to control bleed; it's not worth it.
I dropped regularization images. I'm pretty sure they were dragging down the training. I think reg can be really useful to control bleed and preserve the original model, but not at 100% weight. AI Toolkit has an option (reg_weight: 0.25) to lower it; for my sweaty-shirt LoRA, 0.25 worked great. But I did not find this option in Kohya, so I really recommend dropping reg for Kohya.
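For reference, in AI Toolkit the reg folder sits in the dataset list of the YAML config; from memory it's roughly the fragment below (the exact keys and placement may differ by version, so double-check against your config):

    datasets:
      - folder_path: "/path/to/aric_dataset"
      - folder_path: "/path/to/reg_images"
        is_reg: true          # marks this folder as regularization
        # the option mentioned above; placement here is my best recollection
        # reg_weight: 0.25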
1- Full Fine-tune
So this was my first attempt. I trained with Kohya, and it did result in great quality. It reached great body resemblance and quite OK face resemblance, but not perfect, like 75%. Compared to the later experiments, this now seems almost bad. But the tattoo locations were learned well (though not the tattoos themselves), and it also learned some NSFW concepts from the nudity in the dataset in a way more expressive way than the LoRAs did.
After a while, though, it started to learn "bad quality", "blur", and jpeg artifacts. I have an earlier version (version 3, epoch 2) that was better in quality but worse in resemblance.
The published one is the last epoch of v4. It has some quality issues.
2- Extracted LoRA
So, this did not pan out. Extracting at lower rank gave me a completely different person. Increasing the weight of the LoRA helps, but it's still not good enough; some extracted LoRAs needed a weight of 2.5 to get to SOME Aric resemblance.
A rank 640 LoRA works quite well with just a bump in weight, but it's like 6GB per LoRA. Not worth it in my opinion.
I won't publish this as it is not worth it.
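For context on why low ranks fail here: extraction takes the weight difference between the finetune and the base model and truncates it with an SVD, so everything past the chosen rank is simply thrown away, and that tail is apparently where the likeness lives. The core step per layer looks like this (plain PyTorch sketch of the technique, not the actual extraction script):

    import torch

    def extract_lora(w_base: torch.Tensor, w_tuned: torch.Tensor, rank: int):
        # LoRA approximates delta = up @ down with a rank-`rank` factorization
        delta = (w_tuned - w_base).float()
        u, s, vh = torch.linalg.svd(delta, full_matrices=False)
        up = u[:, :rank] * s[:rank]   # (out_dim, rank)
        down = vh[:rank, :]           # (rank, in_dim)
        return up, down               # up @ down ≈ delta; error grows as rank shrinks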
3- LoRA training as a continuation of the extracted LoRA
I decided to try this experiment: extract the LoRA at low rank > continue training on it with Kohya LoRA training.
This was still with the initial dataset (A) problems.
I did train both text encoders (TEs) here.
It kind of worked, but it wasn't perfect. It got great body resemblance, but the face was still not quite there, and it still required a high strength of 1.6 to work.
At this point, I had finally found a good LoRA block-testing tool (Forge | ComfyUI) and had created my own block remerge tool (here). So I decided to take this LoRA and adjust it with a block analysis.
I reached the conclusion that this preset was quite good:
0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,1,1,1,1,1,1.15,1.15,1.15,1,1,0,0,0.25,0.25,0.25,0.25,0,0,0,0,0,0,0,0,0.5,0.25,0.25,1.5,1.5,1.25,1.25,0,0,0,0
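My remerge tool basically just scales each block's LoRA delta by its preset value. The core is only a few lines; roughly this (the key names assume kohya-style Flux LoRA naming, and you'd split the preset into double/single block lists however your tool defines the mapping):

    import re
    from safetensors.torch import load_file, save_file

    def remerge_blocks(src, dst, double_w, single_w):
        sd = load_file(src)
        for key in sd:
            if not key.endswith("lora_up.weight"):
                continue
            m = re.search(r"(double|single)_blocks_(\d+)_", key)
            if m:
                weights = double_w if m.group(1) == "double" else single_w
                # scaling lora_up scales that block's whole delta
                sd[key] = sd[key] * weights[int(m.group(2))]
        save_file(sd, dst)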
I'm not quite sure, but I think I did a lot of "continuations" here, including some tattoo-only LoRAs just to try to capture the tattoo learning as a separate LoRA. I still have these tattoo LoRAs and I'm still deciding whether it makes sense to merge them into the final LoRA.
They work great as a complement to the following LoRAs, but not expressively enough to deserve a merge (merging LoRAs unfortunately never yields the same result as just generating with both, so for me it gets finicky to choose and test).
So this was the end of the road for this experiment, even though it is a good LoRA.
The best Aric might actually still be a combination of 4 or 5 plus this one.
The final result of this experiment is this lora (not published): adjusted03_AricT_L3-000004
4- Separate NEW LoRA training WITH target layers + better dataset (B) - NO Captions!
So these were the final 2 experiments. My initial goal was to focus on the face now, since my previous LoRA was good with body resemblance/proportions. But in the end, I didn't think merging was necessary and gave up on the previous LoRA (3).
I decided to make some radical training changes and tried the "no caption" strategy: just caption everything "Aric". Not even "Aric man". Just Aric.
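("No captions" just means every caption file contains the trigger word and nothing else; a throwaway script like this sets it up, with the folder path being a placeholder:)

    from pathlib import Path

    # one .txt per image, containing only the trigger word
    for img in Path("dataset/aric_b").glob("*.jpg"):
        img.with_suffix(".txt").write_text("Aric", encoding="utf-8")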
I also did NOT train the TE.
I pruned the dataset as explained above (B) and directed training to the layers that my block analysis of the third (previous) LoRA experiment showed were the ones that mattered. Those were:
"train_double_block_indices": "2,3,4,15,16",
"train_single_block_indices": "0-12,15-18,27-33",
This was finally quite a success! As people have said before (and I'm still very cautious here, as I doubt this would work for completely new and complex concepts), FOR CHARACTER resemblance, or simple concepts, captions seem unnecessary.
The best epoch was 34, with 30 being almost as good (out of 36): AricT_L4_Nocap-000034 (published here)
5- Separate NEW LoRA training WITH target layers + better dataset (B) - YES Captions!
After all these experiments, I had to try it with captions as well to compare. I used the EXACT same config as 4, but with captions.
And no, I did not get the same best epoch. Some people have published comparisons here, and IMO it does not make sense to compare two LoRAs like this: you need to first find the best epoch of each training and only then compare them. I've tested all of them; for the yes-captions LoRA, epoch 30 was the best one, out of 36.
So this was ALSO quite a success! I cannot say it is better or worse than the nocap version; I think sometimes the resemblance is better, and sometimes it is not. The result is: AricT_L5_YsCap-000030 (published here)
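Finding the best epoch boils down to generating the same prompt and seed with every epoch file and eyeballing the resemblance. A diffusers sketch of that loop (file names follow the epoch naming above; adjust to the epochs you actually saved):

    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")

    for epoch in range(2, 37, 2):
        pipe.load_lora_weights(f"AricT_L5_YsCap-{epoch:06d}.safetensors")
        image = pipe(
            "photo of Aric, portrait",
            num_inference_steps=28,
            generator=torch.Generator("cuda").manual_seed(42),  # fixed seed
        ).images[0]
        image.save(f"epoch_{epoch:02d}.png")
        pipe.unload_lora_weights()  # reset before loading the next epoch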
What next?
I still want to do more experiments, especially with the tattoos, but I might move to something else for now. I still want to find a way to teach a LoRA perfect face resemblance and perfectly correct tattoo designs. If anyone has a good workflow for training these, please let me know!
Problems with the LoRAs:
mingled tattoos
face resemblance is not always that great. The real Aric has a sillier, less standard face than what these models produce. This is probably a Flux problem: it brings the perfect "handsome" model face to every person.
The LoRAs have less bleeding than most, probably because of the block-restricted training. I bet it could get even better with a second post-training block analysis.
If the class is different enough, you can get it to work; Aric with a woman will bleed less. The full checkpoint is way better at controlling bleed.