
New version of my base model Serenity V2 (for SD 1.5)


Hello Everyone!

This is my baby - Serenity v2!

My first model (v1) was a merge of ~18 photorealistic models that I liked and the result turned out quite good.

But I always knew that if I ever made v2, it could not be just a merge of models. The first time was about learning how those things are done; doing a merge a second time would just be lazy.

I knew I had to fine-tune along with the merging. There was no way around it - I wanted to include some new material into the mix so it wouldn't be just a rehash of existing models.

So, in the end, here is what I did: I merged 45 models (previously it was only 18), out of which 3 were finetuned by myself.

My goals were:

  • include new material via finetuning

  • be an improvement over Serenity v1

  • be good at photorealistic generations

  • be a great base for other models - LoRAs, LyCORIS, and Textual Inversions

Without false modesty, I can say that I achieved all the objectives :-)

The model was finished a couple of weeks ago; it went first to my alpha testers and was then published on the Mega upload site for my Buy Me a Coffee supporters.

Testers reported that this model indeed feels better than Serenity v1.

During those weeks I was also testing it myself, and I can confirm what others have said: I do feel this is an improvement. Not a great improvement, but an improvement nonetheless. And I'm happy with it.

We are at a stage where the base models are already good, so pushing things a bit further is not easy. If there were no improvement (or if it had come out worse), I would not be releasing it.

To be fair - it is currently my go-to model (I am, of course, biased) :)

As usual, there will be no secrets. I will explain what I did so people who want to push it further will have an easier way.

I started with finetuning and picked Serenity v1 as a base. I did several finetunes, but the ones I kept (and used for merging) were the 50k-step, 100k-step, and 150k-step runs.

The last (biggest) finetune ran for over 10 hours on an RTX 3090 Ti.

The finetunes had different step counts as well as different dataset sizes: the first one had around 10k new images, the second around 20k, and the last around 45k.

The initial dataset consisted mainly of my famous-people datasets (I picked the ones that are indeed most famous), and the second one included even more famous people as well as my concept datasets.

For the final dataset, I added two additional kinds of folders: one with random high-quality photos of people and objects, and one with high-quality MetArt images.

Now it is time for an inconvenient truth: if we want a real representation of a human body (not necessarily a lewd one), we need the full body in various shapes, positions, angles, and distances.

My first model included, as part of the merge, some models that were marked as NSFW, and this one is no exception. By adding the MetArt photos (for those not familiar: those are tasteful but erotic images), the anatomy of a person went a step further.

So, among those 45 merged models, we have 3 that were finetuned and based on Serenity. This choice let me keep the "look and feel" of the original Serenity while still being able to push it further!

One fun fact about the finetunes: on their own, they were quite interesting. I was able to prompt some famous people straight away and get outputs on par with DreamBooth quality. But this was not the case for all names; some came out better than in other models but still not close enough to the source material.

However, using LoRAs or LyCORIS on them had a bad effect: the outputs came out too strong. I knew I had to dilute them by merging with other models.

And dilute the model I did. I will list the whole process below so you can see which models I used, in case you want to include them in your own merge or just check them out :)

If you are interested in the finetuning process: I used my standard fork of diffusers (Shivam Shirao -> Inb4DevOps), the images had to be captioned, they were all cropped to 512x512, and I used Serenity v1 as the base.
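The 512x512 preparation step can be sketched roughly like this. `center_crop_box` is my own illustrative helper, not part of the actual pipeline; with Pillow, the box would then be cropped and resized:

```python
def center_crop_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

# With Pillow installed, applying the crop and resizing to 512x512
# would look like this (sketch, not the author's actual script):
#   from PIL import Image
#   img = Image.open(path)
#   img = img.crop(center_crop_box(*img.size)).resize((512, 512), Image.LANCZOS)
#   img.save(out_path)
```

Landscape and portrait sources both reduce to a centered square before the resize, which is the usual way to avoid distorting faces and bodies.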

As previously mentioned, I used a lot of my celebrity datasets, so the generation of any faces should be improved. Besides that, it also includes my concepts like fire, water, eyes, mouth, etc. And lastly, it includes various images and MetArt images.

The MetArt images were additionally captioned with the token "metart", so using "metart" or "((metart))" might get you interesting results.

Prompting for tokens from my concepts also works; here I think you can use a slightly lower weight (for example: "(perfect eyes:0.5)"). You can also prompt for famous people, and some do shine even without using LoRAs/LyCORIS/Embeddings.
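For those curious how weights like "(perfect eyes:0.5)" or "((metart))" are read, here is a simplified sketch of A1111-style attention syntax. This is my own illustration; the real WebUI grammar also handles escapes, "[...]" de-emphasis, and nesting inside longer prompts:

```python
import re

# "(text:1.2)" sets an explicit weight; each bare "(...)" layer multiplies
# the weight by 1.1, so "((metart))" is roughly weight 1.21.
_EXPLICIT = re.compile(r"\(([^():]+):([\d.]+)\)")

def parse_weight(fragment: str) -> tuple[str, float]:
    """Return (text, weight) for a single prompt fragment."""
    m = _EXPLICIT.fullmatch(fragment)
    if m:
        return m.group(1), float(m.group(2))
    # Count nested plain parentheses for implicit 1.1x emphasis.
    depth = 0
    inner = fragment
    while inner.startswith("(") and inner.endswith(")"):
        depth += 1
        inner = inner[1:-1]
    return inner, round(1.1 ** depth, 4)
```

So `parse_weight("(perfect eyes:0.5)")` yields a weight of 0.5, and `parse_weight("((metart))")` yields roughly 1.21.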

I feel like I need to issue a warning: what you do locally is none of my (or anyone else's) concern, but be mindful and respectful. Do not publish anything that you wouldn't want anyone else to publish of you. And of course, obey your local laws.

You are given a powerful tool (well, it's not like you didn't have the means before; my model is nothing special in the grand scheme of things) - use it wisely! If that felt a bit preachy - I'm sorry, but some people have to be told this; I speak from experience.

Having said that - I hope you will have a wonderful experience with my model. I do have some partial models that I'm thinking of uploading, but you can let me know in the comments if you are interested in them (or I may just upload them regardless if I feel like it would be a benefit to the community).

Besides the finetuning, there was the whole merging process. I went through ~40 models that I liked (the other ones were my finetunes and Serenity v1) - some that were used in Serenity v1 but now with newer versions, and some completely new.

I merged them in threes, so 45 became 15 models. I checked that those 15 models looked decent, then merged them into 5 and checked even more carefully that they looked fine. I then added a finetune and merged them into 2 models.

To those two models, I added the final finetune and merged the three into Serenity v2. I call the first two models RC1 and RC2, and they are solid on their own (those are the models I feel like sharing later).

Sadly, I've looked through my files and it seems I do not have all the steps saved, so I won't be able to share them, but I do have a list of models used (it excludes the finetunes and Serenity itself):

  • absolutereality_v181-81458\absolutereality_v181.safetensors

  • amIPerfection_photorealPerfection-86350\amIPerfection_photorealPerfection.safetensors

  • amIReal_V41-49463\amIReal_V41.safetensors

  • analogMadness_v60-8030\analogMadness_v60.safetensors

  • awportrait_v12-61170\awportrait_v12.safetensors

  • aZovyaPhotoreal_v2-57319\aZovyaPhotoreal_v2.safetensors

  • babes_20-2220\babes_20.safetensors

  • batchcoREALISM_v30-159627\batchcoREALISM_v30.safetensors

  • beautifulArt_v70-85492\beautifulArt_v70.safetensors

  • beautyfoolReality_v30-108111\beautyfoolReality_v30.safetensors

  • clarity_3-5062\clarity_3.safetensors

  • cyberrealistic_v40-15003\cyberrealistic_v40.safetensors

  • Degenerate_deliberateV1-19831\Degenerate_deliberateV1.safetensors

  • dreamshaper_8-4384\dreamshaper_8.safetensors

  • edgeOfRealism_eorV20Fp16BakedVAE-21813\edgeOfRealism_eorV20Fp16BakedVAE.safetensors

  • epicphotogasm_zUniversal-132632\epicphotogasm_zUniversal.safetensors

  • epicrealism_naturalSinRC1VAE-25694\epicrealism_naturalSinRC1VAE.safetensors

  • fusioncoreModern_v05-144475\fusioncoreModern_v05.safetensors

  • hassanblend1512And_hassanblend1512-1173\hassanblend1512And_hassanblend1512.safetensors

  • humans_v10-98755\humans_v10.safetensors

  • icbinpICantBelieveIts_seco-28059\icbinpICantBelieveIts_seco.safetensors

  • juggernaut_aftermath-46422\juggernaut_aftermath.safetensors

  • lazymixRealAmateur_v30b-10961\lazymixRealAmateur_v30b.safetensors

  • limitlessvision_v20-141348\limitlessvision_v20.safetensors

  • lofi_v4-9052\lofi_v4.safetensors

  • moomoofusion_v40Female-133364\moomoofusion_v40Female.safetensors

  • photogenesis_v30-115479\photogenesis_v30.safetensors

  • photon_v1-84728\photon_v1.safetensors

  • pornmasterAmateur_fp16V6-82543\pornmasterAmateur_fp16V6.safetensors

  • pornvision_final-41636\pornvision_final.safetensors

  • pureperfection_v20-53491\pureperfection_v20.safetensors

  • qgoPromptingreal_qgoPromptingrealV1-4188\qgoPromptingreal_qgoPromptingrealV1.safetensors

  • realisticDigital_v50-139300\realisticDigital_v50.safetensors

  • realisticVisionV51_v51VAE-4201\realisticVisionV51_v51VAE.safetensors

  • rundiffusionFX_v10-82972\rundiffusionFX_v10.safetensors

  • subredditV7_v70-80819\subredditV7_v70.safetensors

  • uberRealisticPornMerge_urpmv13-2661\uberRealisticPornMerge_urpmv13.safetensors

  • unrealityV30_v30-12967\unrealityV30_v30.safetensors

  • wyvernmix15XL_xlV18-5273\wyvernmix15XL_v9.safetensors

My process was as follows: I had 45 models and had to divide them into 15 groups of 3. I picked three labels - "photorealism", "human body", and "style" - and grouped the models by their primary specialization. I had to be a bit flexible with that, as I needed the groups to be even.

Then I prepared those 15 groups by picking one model from each of the three baskets (so in the initial merging round I would always merge "photorealism" with "human body" and with "style").
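The basket-picking step above can be sketched as a simple zip over three equally sized lists. The basket contents here are placeholders, not the real model assignments:

```python
# Three "baskets" of 15 models each, combined into 15 triples so that every
# triple gets exactly one model per specialization.
def make_triples(photorealism: list, human_body: list, style: list) -> list:
    assert len(photorealism) == len(human_body) == len(style)
    return list(zip(photorealism, human_body, style))

# Placeholder names standing in for the actual checkpoint files.
groups = make_triples(
    [f"photo_{i}" for i in range(15)],
    [f"body_{i}" for i in range(15)],
    [f"style_{i}" for i in range(15)],
)
# groups[0] pairs the first model from each basket.
```

Each triple then becomes one three-way merge in the first round (45 -> 15).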

For the merging, I used the same tool as with Serenity v1: Chattiori-Model-Merger.

The script would be called as follows:

Merging three models into one (45 -> 15, 15 -> 5, 5+1 -> 2):

python "ST" "D:/Development/StableDiffusion/SDModels/Models/merging/part1" "model1.safetensors" "model2.safetensors" --model_2 "model3.safetensors" --alpha 0.33 --beta 0.33 --save_safetensor --save_half --output "D:/Development/StableDiffusion/SDModels/Models/merging/outputmodel1"

Merging two models into one (2 -> 1):

python "WS" "D:/Development/StableDiffusion/SDModels/Models/merging/part3/" "merged-st-model-rc-1.safetensors" "merged-st-model-rc-2.safetensors" --alpha 0.5 --save_safetensor --save_half --output "D:/Development/StableDiffusion/SDModels/Models/merging/part3/merged-st-model-final"
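For reference, the math behind these merges can be sketched on plain dicts standing in for checkpoint state dicts. Note that the exact semantics of Chattiori's "ST" mode are my assumption here (merge the first two models, then merge the result with the third), not something confirmed by this article:

```python
# Weighted-sum ("WS") merge: each tensor is interpolated between two models.
def weighted_sum(a: dict, b: dict, alpha: float) -> dict:
    """out = (1 - alpha) * a + alpha * b, applied per key."""
    return {k: (1 - alpha) * a[k] + alpha * b[k] for k in a}

# Assumed "sum twice" ("ST") merge: fold a third model into the result.
def sum_twice(a: dict, b: dict, c: dict, alpha: float, beta: float) -> dict:
    return weighted_sum(weighted_sum(a, b, alpha), c, beta)

# Toy single-weight "models" instead of real safetensors state dicts.
a, b, c = {"w": 0.0}, {"w": 1.0}, {"w": 2.0}
print(weighted_sum(a, b, 0.5))  # {'w': 0.5}
print(sum_twice(a, b, c, 0.33, 0.33))
```

With alpha = beta = 0.33 as in the command above, the third model ends up with about a third of the weight while the first two share the rest unevenly, which is why checking the intermediate merges visually matters.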

Some samples are attached to the article, so you can download the zip files and get all the prompt data. If you just want to look at them, here is an Imgur gallery: