
SomniumSC

Verified: SafeTensor
Type: Checkpoint Trained
Published: Mar 6, 2024
Base Model: Stable Cascade
Training: Epochs: 40
Hash (AutoV2): D41A1BC696
SDXL Training Contest Participant
NCAI
License:
This Stability AI Model is licensed under the Stability AI Non-Commercial Research Community License, Copyright (c) Stability AI Ltd. All Rights Reserved.

The first high-quality anime-style model for Stable Cascade is here. SomniumSC's goal is to be the Waifu Diffusion of Stable Cascade. A Diffusers version can also be found on our Hugging Face.

On CivitAI, there are two files for each weight size: the fine-tuned Stage C model and the fine-tuned text encoder (packaged as a zip). Download both and extract the zip file to get the .safetensors file so the model can be used in ComfyUI; the instructions are below. If you want to use our model in diffusers 🧨, check our Hugging Face repo, which includes example code.
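The extraction step can be scripted. The following is a minimal sketch using only the Python standard library; the function name and the idea of flattening folders inside the archive are illustrative assumptions, not part of the official instructions.

```python
import shutil
import zipfile
from pathlib import Path


def extract_safetensors(zip_path: Path, dest_dir: Path) -> list[Path]:
    """Extract only the .safetensors members of a zip archive into dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            # Skip folders and anything that is not a weight file.
            if member.is_dir() or not member.filename.endswith(".safetensors"):
                continue
            # Flatten folders inside the archive so the file lands directly in dest_dir.
            target = dest_dir / Path(member.filename).name
            with archive.open(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(target)
    return extracted
```

Calling `extract_safetensors(Path("text_encoder.zip"), Path("extracted"))` (both paths are placeholders) would return the list of extracted weight files.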

Say goodbye to negative prompts, "word salad" in your positive prompt, and the hassle of captioning. Starting from SomniumSC v1.1, you don't need any prompt adjustment to generate stunning images, and captioning is much simpler. The model can generate good images even without a negative prompt. You can still use a negative prompt when there are undesired items in the image, such as elf ears or a random halo.

You can support me on Ko-Fi

__________________________________________________________________________________________________

SomniumSC is a fine-tuned model based on Stability AI's new Stable Cascade (also known as Würstchen v3), trained on the Stage C 3.6B model for a 2D (cartoonish) style. The text encoder is also trained for the 2D style, so the model can be prompted not only with booru tags but also with natural language.

The model uses the same dataset size and method as AnySomniumXL v2: 33,000+ images curated from hundreds of thousands of images from various sources. The dataset is built by keeping images that have an aesthetic score of at least 19 and at most 50 (to keep the model cartoonish and not too realistic; the scale is based on our proprietary aesthetic scoring mechanism) and that contain no text or watermarks, such as signatures or comic/manga pages. Images with an aesthetic score of less than 17 or more than 50 are discarded, as are images that contain watermarks or text.
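The curation rule above can be sketched as a simple filter. This is an illustration only: the scoring mechanism is proprietary, so the `Candidate` fields and function names are hypothetical, and the lower bound is left as a parameter that defaults to the stated minimum of 19.

```python
from dataclasses import dataclass

# Assumed defaults taken from the description: keep scores in [19, 50].
MIN_SCORE = 19.0
MAX_SCORE = 50.0


@dataclass
class Candidate:
    """Hypothetical record for one scored image."""
    path: str
    aesthetic_score: float
    has_watermark: bool
    has_text: bool


def keep_image(c: Candidate, min_score: float = MIN_SCORE,
               max_score: float = MAX_SCORE) -> bool:
    """Keep an image only if its score is in range and it has no watermark or text."""
    in_range = min_score <= c.aesthetic_score <= max_score
    return in_range and not c.has_watermark and not c.has_text


def curate(candidates: list[Candidate]) -> list[Candidate]:
    """Return the subset of candidates that passes the filter."""
    return [c for c in candidates if keep_image(c)]
```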

SomniumSC Technical Specification:

  • Trained for 40 epochs (the released SomniumSC results use epoch 40)

  • Captioned by a proprietary multimodal LLM, better than LLaVA

  • Trained with bucket sizes of 1024x1024 and 1536x1536 (multi-resolution)

  • Shuffle Caption: Yes

  • Clip Skip: 0

  • Trained with 1x NVIDIA A100 80GB
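As an illustration of the "Shuffle Caption" setting, the sketch below randomizes the order of comma-separated tags for each training sample. The trainer used is not named here, so this is an assumption about how such an option typically behaves; the `keep_first` parameter is hypothetical.

```python
import random


def shuffle_caption(caption: str, rng: random.Random, keep_first: int = 0) -> str:
    """Shuffle comma-separated tags, optionally keeping the first few in place."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    head, tail = tags[:keep_first], tags[keep_first:]
    rng.shuffle(tail)  # in-place shuffle of the tags that are allowed to move
    return ", ".join(head + tail)
```

Passing a seeded `random.Random` makes the shuffle reproducible across runs.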

The dataset was built using a combination of the CLIP model and the MLP scoring method by christophschuhmann, modified by us. It uses ViT-L/14 to produce aesthetic scores on a scale of -1 to 100, with our own watermark detection added.
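The general shape of a CLIP-plus-MLP scorer can be sketched in plain Python: an image embedding is L2-normalized and passed through a small stack of linear layers that outputs one score. The layer sizes, the ReLU activation, and the weights are all assumptions for illustration; they do not reproduce the modified scorer described above.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def linear(vec: list[float], weights: list[list[float]], bias: list[float]) -> list[float]:
    """Apply one dense layer: each output is a weighted sum of inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def mlp_score(embedding: list[float], layers: list[tuple]) -> float:
    """Map an image embedding to a scalar score via (weights, bias) layers."""
    h = l2_normalize(embedding)
    for i, (weights, bias) in enumerate(layers):
        h = linear(h, weights, bias)
        if i < len(layers) - 1:
            h = [max(0.0, x) for x in h]  # ReLU between hidden layers (assumed)
    return h[0]
```

In practice the embedding would come from a CLIP ViT-L/14 image encoder and the layer weights from a trained scoring head.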

Achievements in SomniumSC v1.1:

✓ Produces a stronger 2D style from natural language by default, without the need for excessive negative or positive prompts

✓ More likely to produce better fingers than the average Stable Diffusion model, without ADetailer or inpainting

✓ Produces a more authentic 2D style without the need for negative prompts like "realistic"

✓ Does not produce images with random watermarks or text

✓ Can produce better text, even compared with AnySomniumXL v3.5.1

✓ Goodbye to “negative prompts”. You no longer need to use a negative prompt to prevent bad images unless there is an unwanted object

✓ Produces better colour than SomniumSC v1

✓ Much simpler captioning

Compared with SDXL-based models, this Stable Cascade model produces better fingers, hands, and feet, finer character detail, holds objects much better, and can generate at up to 1536px. If you dare, you can generate with this model at up to 2048px.

Limitations:

✓ Still requires broader dataset training for more variation in poses and styles

✓ Rendered text is limited to a maximum of 2 words

✓ This model is optimized for human or mutated-human generation. Non-human subjects such as SCP, ponies, and others may not produce the results you expect

✓ Faces may look compressed. Generating the image at 1536px may give better results

A smaller half-size version and a Stable Cascade lite version will be released soon

How to use SomniumSC:

Currently, Stable Cascade is only supported by ComfyUI, but you can also use our demo

You can follow the tutorials here or here

To make it clear which models you need to download, here is where to download each one directly

For Stage A, you can download from here

For Stage B, you can download from here

For Stage C, you can download the safetensors from CivitAI or our Hugging Face repo

The text encoder can be downloaded from our Hugging Face repo
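Once the four files are downloaded, they need to go into the right ComfyUI folders. The sketch below assumes the commonly described layout for Stable Cascade in ComfyUI (Stage A under `models/vae`, Stages B and C under `models/unet`, the text encoder under `models/clip`); this mapping is an assumption, so check it against the tutorial you follow. The function and dictionary names are illustrative.

```python
import shutil
from pathlib import Path

# Assumed ComfyUI subfolder for each component; verify against your tutorial.
COMFYUI_SUBDIRS = {
    "stage_a": "models/vae",
    "stage_b": "models/unet",
    "stage_c": "models/unet",
    "text_encoder": "models/clip",
}


def install_component(comfyui_root: Path, component: str, file_path: Path) -> Path:
    """Copy a downloaded file into the assumed ComfyUI folder for its component."""
    target_dir = comfyui_root / COMFYUI_SUBDIRS[component]
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / file_path.name
    shutil.copy2(file_path, target)
    return target
```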

SomniumSC Pro tips:

If the model produces pointy ears on a character, just add elf or pointy ears to the negative prompt.

If the model produces a "compressed face", use 1536px resolution so the face is rendered clearly.

Disclaimer:

This model is under the STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE, which means the model cannot be sold and derivative works cannot be commercialized. The exception, as far as I know, is that you can buy a Stability AI membership here to commercialize derivative works based on this model. Please support Stability AI so they can keep providing open-source models for us. You can still merge our model freely