Haigaku-Medium

Name: Haigaku-Medium
Rating: 5 (20 reviews)
Author: NCAI

Updated: Mar 22, 2025

character

anime illustration

Download (4.76 GB)

Verified: 6 months ago

SafeTensor

Details

Type	Checkpoint Trained
Stats	224 0
Reviews	Positive (17)
Published	Mar 22, 2025
Base Model	SD 3.5 Medium
Training	Epochs: 18
Hash	AutoV2 0FC4921967

3 Files

About this version

License:

Stability AI Community License Agreement

Haigaku-Medium is the smaller version of Haigaku-Large, the first 2D-style (anime) fine-tuned model for SD 3.5. It is fine-tuned from Stable Diffusion 3.5 Medium. Our initial goal is for this model to serve as a kind of style "regularization" for LoRA tuning. This is the initial version of Haigaku, so the model may not generate perfect anatomy, but it is usable and can generate eye-catching images.

This model was trained on a curated dataset of 100k+ Danbooru images from 2021-2024, with natural-language captions that retain character names. The dataset will be expanded with each major version update. Images are captioned by the latest proprietary LLM, which has the lowest level of censoring and does not hallucinate when captioning NSFW content. Haigaku-Medium was fine-tuned on an Nvidia H100 with an effective batch size of 800, keeping all three text encoders frozen (we did not touch or fine-tune the text encoders).

We use a newer method to curate our dataset. Our previous model (AnySomnium) used a CLIP+MLP predictor; here we use the SigLIP-based aesthetic predictor by discus0434 instead. The dataset is built by keeping images with an aesthetic score of at least 5.5, out of a maximum of 10, following the value recommended by the predictor's author. The SigLIP aesthetic predictor does a better job of scoring illustrations than the older one.
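The score-based filtering described above can be sketched as follows. This is only an illustration, not our actual curation pipeline: the `ScoredImage` type, the file names, and the scores are hypothetical, and the scores are assumed to have been produced by the aesthetic predictor beforehand.

```python
from dataclasses import dataclass

MIN_SCORE = 5.5   # threshold stated in the description above
MAX_SCORE = 10.0  # upper end of the score range


@dataclass
class ScoredImage:
    path: str     # hypothetical file name
    score: float  # aesthetic score, assumed to be precomputed


def filter_by_aesthetic(images, min_score=MIN_SCORE, max_score=MAX_SCORE):
    """Keep only images whose score lies in [min_score, max_score]."""
    return [img for img in images if min_score <= img.score <= max_score]


# Made-up scores, for illustration only.
candidates = [
    ScoredImage("a.png", 4.9),
    ScoredImage("b.png", 5.5),
    ScoredImage("c.png", 7.2),
]
kept = filter_by_aesthetic(candidates)
print([img.path for img in kept])
```

Only the 5.5 threshold and the maximum of 10 come from the description; everything else is a placeholder.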

The fine-tuning process uses a recent dynamic learning-rate strategy that adjusts the learning rate automatically to avoid overtraining or undertraining while keeping VRAM consumption efficient. Fine-tuning also ran with the tf32 (full precision) data type and used a multi-resolution strategy, which improves quality across a broader range of resolutions, not only at 1024px. The model was trained at resolutions up to 1280px.
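One common way to implement multi-resolution training is aspect-ratio bucketing, sketched below. This is an assumption for illustration and not necessarily the strategy used for Haigaku: only the 1280px maximum side and the 1024px reference resolution come from the description, while the 64px step, the 512px minimum side, and the function names are hypothetical.

```python
def make_buckets(max_side=1280, target_area=1024 * 1024, step=64, min_side=512):
    """Build (width, height) buckets on a `step` grid.

    Each bucket has at most `target_area` pixels and no side longer
    than `max_side`.
    """
    buckets = set()
    for w in range(min_side, max_side + 1, step):
        # Largest grid-aligned height that keeps w * h within the target area.
        h = min(max_side, (target_area // w) // step * step)
        if h >= min_side:
            buckets.add((w, h))
            buckets.add((h, w))  # matching portrait/landscape counterpart
    return sorted(buckets)


def nearest_bucket(width, height, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


buckets = make_buckets()
print(nearest_bucket(1000, 1000, buckets))
```

Each training image would then be resized and cropped to its nearest bucket, so batches can mix square, portrait, and landscape resolutions.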

Model Information:

There are four versions of the model. The pruned versions have no text encoders attached in the safetensors file and can be downloaded in either FP32 or BF16; the full model includes the text encoders.
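To check which variant a downloaded file is, the safetensors header can be read with the Python standard library alone: the file starts with an 8-byte little-endian length, followed by a JSON header listing every tensor and its dtype. The sketch below is a minimal example. The `text_encoders.` key prefix is an assumption about how the full checkpoint names its text encoder weights, and the file name in the usage comment is hypothetical.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the tensor table from a .safetensors file without loading weights."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def summarize(header, text_encoder_prefix="text_encoders."):
    """Report the stored dtypes and whether text encoder weights seem present.

    `text_encoder_prefix` is an assumed naming convention; adjust it to
    match the keys actually found in the file.
    """
    dtypes = sorted({info["dtype"] for info in header.values()})
    has_te = any(name.startswith(text_encoder_prefix) for name in header)
    return {"dtypes": dtypes, "has_text_encoder": has_te}


# Usage (hypothetical file name):
# print(summarize(read_safetensors_header("haigaku-medium.safetensors")))
```

A pruned BF16 file would be expected to report only `BF16` with no text encoder keys, but the exact key names should be verified against the real file.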

Haigaku-Medium is also still in development, with broader datasets. We will update the model regularly.