Hello Everyone!
This is my baby - Serenity v2!
My first model (v1) was a merge of ~18 photorealistic models that I liked and the result turned out quite good.
But I always knew that if I ever made v2 - it could not be just a merge of models. The first time it was to learn how those things are done. Doing a merge for a second time would be just lazy.
I knew I had to fine-tune along with the merging. There was no way around it - I wanted to include some new material into the mix so it wouldn't be just a rehash of existing models.
So, in the end, here is what I did: I merged 45 models (previously it was only 18), out of which 3 were finetuned by myself.
My goals were:
include new material via finetuning
be an improvement over Serenity v1
be good at photorealistic generations
be a great base for other models - LoRAs, LyCORIS, and Textual Inversions
With no false modesty, I can say that I achieved all the objectives :-)
The model was finished a couple of weeks ago; it was sent first to my alpha testers and then published on the mega upload site for my buymeacoffee supporters.
People tested it and said that it indeed feels better than Serenity v1.
During those weeks I was also testing it myself and I can confirm what others have said. I do feel like this is an improvement. It is not a huge improvement, but it is an improvement nonetheless. And I'm happy with it.
We are at a stage where the base models are already good, so pushing things a bit further is not as easy. If there had been no improvement (or if it had come out worse), I would not have released it.
To be fair - it is currently my go-to model (I am, of course, biased) :)
As usual, there will be no secrets. I will explain what I did so people who want to push it further will have an easier way.
I started with finetuning and picked Serenity v1 as a base. I did several finetunes, but the ones I kept (and used for merging) were the runs with 50k steps, 100k steps, and 150k steps.
The last (biggest) fine-tuning run took over 10 hours on a 3090 Ti.
The finetunes had different step counts as well as different dataset sizes. The first one was around 10k new images, the second one was around 20k and the last one was around 45k new images.
The initial dataset consisted mainly of my famous-people datasets (I picked the people who are indeed the most famous), and the second one included even more famous people as well as my concept datasets.
For the final dataset, I added two additional kinds of folders, one with random high-quality photos of people and objects and the other one - with high-quality metart images.
Now for an inconvenient truth: if we want a real representation of the human body (not necessarily a lewd one), we need the full body in various shapes, positions, angles, and distances.
My first model had, as part of the merge, some models that were marked as NSFW, and this one is no exception. By adding the MetArt photos (for those not familiar: those are tasteful but erotic images), the anatomy of generated people took a step forward.
So, in those 45 merged models, we have 3 models that were finetuned and were based on Serenity. This choice let me keep the "look and feel" of the original Serenity but at the same time - be able to push it further!
One fun fact about the finetunes: on their own, they were quite interesting. I was able to prompt some famous people straight away and would get outputs on par with dreambooth quality. This was not the case for all names, though; some came out better than in other models but still not close enough to the source material.
However, using LoRAs or LyCORIS on top of them had a bad effect: the outputs would come out too strong. I knew I had to dilute the finetunes by merging them with other models.
And dilute them I did. I will list the whole process way below so you can see which models I used, in case you want to include them in your own merge or just check them out on your own :)
If you are interested in the finetuning process: I used my standard fork of diffusers (Shivam Shrirao -> Inb4DevOps), the images had to be captioned, they were all cut to 512x512, and I used Serenity v1 as a base.
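The exact preprocessing script isn't shown here; as a rough illustration (my own, not the actual pipeline), "cut to 512x512" typically means taking the largest centered square of each image before resizing it down:

```python
def center_crop_box(width, height):
    """Largest centered square crop box (left, top, right, bottom);
    the resulting square would then be resized to 512x512."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

# e.g. a 1024x768 photo keeps the middle 768x768 region
print(center_crop_box(1024, 768))  # (128, 0, 896, 768)
```
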
As previously mentioned, I used a lot of my celebrity datasets, so the generation of faces in general should be improved. Besides that, the model also includes my concepts like fire, water, eyes, mouth, etc. And lastly, it includes various other images and the MetArt images.
The MetArt images were captioned additionally with the token "metart" so using "metart" or "((metart))" might get you interesting results.
Prompting for tokens from my concepts also works, here I think you can use it with a bit lower weight (for example: "(perfect eyes:0.5)"). You can also prompt for famous people and some do shine even without using Loras/Lycoris/Embeddings.
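For reference, in AUTOMATIC1111-style prompt syntax (which I'm assuming here), each pair of parentheses multiplies attention by 1.1, while an explicit ":weight" sets it directly; a tiny sketch of the arithmetic:

```python
def attention_weight(paren_depth, explicit=None):
    # "(x:0.5)" sets the weight explicitly; otherwise each nesting
    # level of parentheses multiplies attention by 1.1.
    return explicit if explicit is not None else 1.1 ** paren_depth

print(attention_weight(2))                # "((metart))" -> ~1.21
print(attention_weight(1, explicit=0.5))  # "(perfect eyes:0.5)" -> 0.5
```
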
I feel like I need to issue a warning: what you do locally is none of my (or anyone else's) concern, but be mindful and respectful. Do not publish anything that you wouldn't want anyone else to publish of you. And of course, obey your local laws.
You are given a powerful tool (well, it's not like you didn't have the means before; my model is nothing special in the grand scheme of things) - use it wisely! If that felt a bit preachy, I'm sorry, but some people have to be told these things; I speak from experience.
Having said that - I hope you will have a wonderful experience with my model. I do have some partial models that I'm thinking of uploading, but you can let me know in the comments if you are interested in them (or I may just upload them regardless if I feel like it would be a benefit to the community).
Besides the finetune, there was this whole merging process. I went through ~40 models (the other ones were my finetunes and Serenity v1) that I liked (some that were used in Serenity v1 but now with newer versions and some completely new).
I merged them in threes, so 45 models became 15. I checked that those 15 models looked decent, then merged them into 5 and checked those even more thoroughly, then added a finetune and merged them into 2 models.
To those two models, I added the final finetune and merged the three of them into Serenity v2. I call the first two models RC1 and RC2, and they are solid on their own (those are the models I feel like sharing later).
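The merge tool does the real work on the model weights, but the operation at each step of that funnel is essentially a weighted sum of the models' parameters. A toy sketch (plain floats standing in for tensors; the names are mine):

```python
def weighted_sum(models, weights):
    """Merge same-shaped state dicts by a per-key weighted sum."""
    assert abs(sum(weights) - 1.0) < 1e-6, "weights should sum to 1"
    return {k: sum(w * m[k] for m, w in zip(models, weights))
            for k in models[0]}

# one 3 -> 1 step of the funnel, equal thirds
a, b, c = {"w": 0.0}, {"w": 3.0}, {"w": 6.0}
merged = weighted_sum([a, b, c], [1/3, 1/3, 1/3])
print(merged)  # {'w': 3.0}
```
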
Sadly, I've looked through my files now and it seems I do not have all the steps saved, so I won't be able to share them, but I do have a list of the models used (it excludes the finetunes and Serenity itself):
absolutereality_v181-81458\absolutereality_v181.safetensors
amIPerfection_photorealPerfection-86350\amIPerfection_photorealPerfection.safetensors
amIReal_V41-49463\amIReal_V41.safetensors
analogMadness_v60-8030\analogMadness_v60.safetensors
awportrait_v12-61170\awportrait_v12.safetensors
aZovyaPhotoreal_v2-57319\aZovyaPhotoreal_v2.safetensors
babes_20-2220\babes_20.safetensors
batchcoREALISM_v30-159627\batchcoREALISM_v30.safetensors
beautifulArt_v70-85492\beautifulArt_v70.safetensors
beautyfoolReality_v30-108111\beautyfoolReality_v30.safetensors
clarity_3-5062\clarity_3.safetensors
cyberrealistic_v40-15003\cyberrealistic_v40.safetensors
Degenerate_deliberateV1-19831\Degenerate_deliberateV1.safetensors
dreamshaper_8-4384\dreamshaper_8.safetensors
edgeOfRealism_eorV20Fp16BakedVAE-21813\edgeOfRealism_eorV20Fp16BakedVAE.safetensors
epicphotogasm_zUniversal-132632\epicphotogasm_zUniversal.safetensors
epicrealism_naturalSinRC1VAE-25694\epicrealism_naturalSinRC1VAE.safetensors
fusioncoreModern_v05-144475\fusioncoreModern_v05.safetensors
hassanblend1512And_hassanblend1512-1173\hassanblend1512And_hassanblend1512.safetensors
humans_v10-98755\humans_v10.safetensors
icbinpICantBelieveIts_seco-28059\icbinpICantBelieveIts_seco.safetensors
juggernaut_aftermath-46422\juggernaut_aftermath.safetensors
lazymixRealAmateur_v30b-10961\lazymixRealAmateur_v30b.safetensors
limitlessvision_v20-141348\limitlessvision_v20.safetensors
lofi_v4-9052\lofi_v4.safetensors
moomoofusion_v40Female-133364\moomoofusion_v40Female.safetensors
photogenesis_v30-115479\photogenesis_v30.safetensors
photon_v1-84728\photon_v1.safetensors
pornmasterAmateur_fp16V6-82543\pornmasterAmateur_fp16V6.safetensors
pornvision_final-41636\pornvision_final.safetensors
pureperfection_v20-53491\pureperfection_v20.safetensors
qgoPromptingreal_qgoPromptingrealV1-4188\qgoPromptingreal_qgoPromptingrealV1.safetensors
realisticDigital_v50-139300\realisticDigital_v50.safetensors
realisticVisionV51_v51VAE-4201\realisticVisionV51_v51VAE.safetensors
rundiffusionFX_v10-82972\rundiffusionFX_v10.safetensors
subredditV7_v70-80819\subredditV7_v70.safetensors
uberRealisticPornMerge_urpmv13-2661\uberRealisticPornMerge_urpmv13.safetensors
unrealityV30_v30-12967\unrealityV30_v30.safetensors
wyvernmix15XL_xlV18-5273\wyvernmix15XL_v9.safetensors
My process was as follows: I had 45 models and had to divide them into 15 groups of 3. I picked three labels - "photorealism", "human body", and "style" - and grouped the models by their primary specialization. I had to be a bit flexible with that, as I needed the groups to be even.
Then I prepared the 15 groups by picking one model from each of those three baskets (so in the initial merging round I would always merge "photorealism" with "human body" and with "style").
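That grouping step can be pictured as zipping the three baskets together (placeholder names are mine; in reality each basket held 15 models):

```python
# One model from each basket per merge triple.
photorealism = ["photo_a", "photo_b", "photo_c"]
human_body   = ["body_a", "body_b", "body_c"]
style        = ["style_a", "style_b", "style_c"]

triples = list(zip(photorealism, human_body, style))
print(triples[0])  # ('photo_a', 'body_a', 'style_a')
```
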
For the merging, I used the same tool as with Serenity v1: Chattiori-Model-Merger (https://github.com/Faildes/Chattiori-Model-Merger.git)
The script would be called as follows:
Merging three models into one (45 -> 15, 15 -> 5, 5+1 -> 2):
python merge.py "ST" "D:/Development/StableDiffusion/SDModels/Models/merging/part1" "model1.safetensors" "model2.safetensors" --model_2 "model3.safetensors" --alpha 0.33 --beta 0.33 --save_safetensor --save_half --output "D:/Development/StableDiffusion/SDModels/Models/merging/outputmodel1"
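I haven't verified the exact semantics of Chattiori's "ST" mode, but if it is a three-way weighted sum (which the alpha/beta values suggest), then with alpha = beta = 0.33 the first model keeps the remaining ~0.34, so all three models end up at roughly equal thirds:

```python
# Assumed interpretation: merged = w1*model1 + alpha*model2 + beta*model3
alpha, beta = 0.33, 0.33
w1 = 1.0 - alpha - beta
print(round(w1, 2))  # 0.34 -> roughly equal thirds
```
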
Merging two models into one (2 -> 1):
python merge.py "WS" "D:/Development/StableDiffusion/SDModels/Models/merging/part3/" "merged-st-model-rc-1.safetensors" "merged-st-model-rc-2.safetensors" --alpha 0.5 --save_safetensor --save_half --output "D:/Development/StableDiffusion/SDModels/Models/merging/part3/merged-st-model-final"
Some samples are attached to the article, so you can download the zip files and get all the prompt data. If you just want to look at them, here is an Imgur gallery: https://imgur.com/gallery/HrR0AZQ