AnySomniumXL

Verified: SafeTensor
Type: Checkpoint Trained
Stats
202
Reviews
Published: Feb 6, 2024
Base Model: SDXL 1.0
Hash (AutoV2): 14983DE617
SDXL Training Contest Participant
NCAI

[Proudly introducing AnySomniumXL v3, an SDXL model]

You can support me on Ko-Fi

This SDXL model with a 2D (cartoonish) style is trained from the base SDXL model (SDXL Base v1.0). The text encoder was trained as well, so the model generates a 2D style from natural-language prompts and is unlikely to produce the realistic style inherent in SDXL Base.

The model is trained on 133,000+ images curated from hundreds of thousands of images from various sources. The dataset keeps only images with an aesthetic score of at least 17 and at most 50, as measured by our proprietary aesthetic scoring mechanism; the upper bound keeps the model cartoonish rather than too realistic. Images that contain text or watermarks, such as signatures or comic/manga images, are discarded, as are images scoring below 17 or above 50.
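The filtering rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` record, its field names, and the `keep` and `filter_dataset` helpers are hypothetical, and the scores and watermark flags are assumed to come from the proprietary scorer and detector, which are not shown here.

```python
from dataclasses import dataclass

# Thresholds taken from the description above (both ends inclusive).
MIN_SCORE = 17
MAX_SCORE = 50


@dataclass
class Candidate:
    """Hypothetical record for one scraped image (names are assumptions)."""
    path: str
    aesthetic_score: float       # assumed output of the aesthetic scorer
    has_text_or_watermark: bool  # assumed output of the watermark/text detector


def keep(candidate: Candidate) -> bool:
    """Return True if the image passes both the score and the watermark rule."""
    in_range = MIN_SCORE <= candidate.aesthetic_score <= MAX_SCORE
    return in_range and not candidate.has_text_or_watermark


def filter_dataset(candidates):
    """Keep only the candidates that satisfy the curation rule."""
    return [c for c in candidates if keep(c)]


if __name__ == "__main__":
    sample = [
        Candidate("a.png", 32.0, False),  # kept
        Candidate("b.png", 12.5, False),  # dropped: score below 17
        Candidate("c.png", 61.0, False),  # dropped: score above 50
        Candidate("d.png", 40.0, True),   # dropped: watermark or text
    ]
    print([c.path for c in filter_dataset(sample)])
```

Keeping both bounds inclusive matches the "at least 17 and a maximum of 50" wording; the actual pipeline may treat the boundaries differently.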

AnySomniumXL v3 Technical Specifications:

  • Trained for 16 epochs (the released AnySomniumXL uses the epoch 16 result)

  • Captioned by a proprietary multimodal LLM, which we find better than LLaVA

  • Trained with a bucket size of 1280x1280

  • Shuffle Caption: Yes

  • Clip Skip: 2

  • Trained with 2x NVIDIA A100 80GB

The dataset was built with a combination of the CLIP model and the MLP scoring method by christophschuhmann, which we modified. It uses ViT-L/14 to produce aesthetic scores on a scale of -1 to 100, and we added our own watermark detection.

Achievements:

✓ Produces 2D-style images from natural language by default, without the need for excessive negative or positive prompts

✓ Most likely produces better fingers than the average Stable Diffusion model, without ADetailer or inpainting

✓ Produces a more authentic 2D style without the need for negative prompts

✓ Does not produce images with random watermarks or text

Limitations:

✓ Slightly weak at rendering characters holding objects such as weapons or items correctly

✓ Still requires broader dataset training

✓ There are still some gaps in the text encoder, so there is room for improvement

✓ Text cannot be generated correctly

✓ This model is optimized for human or mutated-human generation. Non-human subjects such as SCP, ponies, and others may not turn out as you expect

AnySomniumXL v3 Pro tips:

Because AnySomniumXL v3 was trained at 1280x1280, the best resolution for many aspect ratios may differ from that of a standard SDXL model.

Best resolutions (you can flip the numbers for landscape or portrait):

  • 1280x1280

  • 1472x1088

  • 1152x1408

  • 1536x1024

  • 1856x832

  • 1024x1600
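As a convenience, the list above can be turned into a small lookup that flips a resolution to the desired orientation or picks the listed resolution closest to a target aspect ratio. This is a minimal sketch; the helper names `orient` and `closest_resolution` are hypothetical and not part of any tool, and comparing ratios in log space is simply one reasonable choice.

```python
import math

# Recommended resolutions from the list above, as (width, height).
BEST_RESOLUTIONS = [
    (1280, 1280),
    (1472, 1088),
    (1152, 1408),
    (1536, 1024),
    (1856, 832),
    (1024, 1600),
]


def orient(resolution, landscape):
    """Flip a (width, height) pair so it is landscape or portrait as requested."""
    long_side, short_side = max(resolution), min(resolution)
    return (long_side, short_side) if landscape else (short_side, long_side)


def closest_resolution(target_ratio):
    """Pick the listed resolution (or its flip) nearest to width/height = target_ratio."""
    candidates = set()
    for w, h in BEST_RESOLUTIONS:
        candidates.add((w, h))
        candidates.add((h, w))  # flipping is allowed per the note above

    def distance(res):
        # Compare in log space so 2:1 and 1:2 are treated symmetrically.
        return abs(math.log(res[0] / res[1]) - math.log(target_ratio))

    # Sort first so the result is deterministic if two candidates tie.
    return min(sorted(candidates), key=distance)


if __name__ == "__main__":
    print(orient((1152, 1408), landscape=True))
    print(closest_resolution(16 / 9))
    print(closest_resolution(9 / 16))
```

For example, a 16:9 target resolves to the flipped 1024x1600 entry, since none of the listed sizes is exactly 16:9.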

More versions are coming, with broader datasets and a trained text encoder. Our target is to produce the largest clean datasets for our training. It is recommended to use this model with the Automatic1111 webui.