Haigaku-Medium

Updated: Mar 22, 2025

Tags: character, anime, illustration

Verified: SafeTensor

Type: Checkpoint Trained

Published: Mar 22, 2025

Base Model: SD 3.5 Medium

Training Epochs: 18

Hash (AutoV2): 0FC4921967

NCAI

This Stability AI Model is licensed under the Stability AI Community License, Copyright (c) Stability AI Ltd. All Rights Reserved.

Powered by Stability AI

Haigaku-Medium is the smaller version of Haigaku-Large, the first 2D-style (anime) fine-tuned model for SD 3.5. It is fine-tuned from the Stable Diffusion 3.5 Medium model. Our initial goal is for this model to serve as a kind of style "regularization" base for LoRA tuning. This is the initial version of Haigaku, so the model may not generate perfect anatomy, but it is usable and can generate eye-catching images.

This model was trained on a curated dataset of 100k+ Danbooru images from 2021-2024, captioned in natural language while retaining character names. The dataset will be expanded with each major version update. Images are captioned by a recent proprietary LLM that has the lowest censoring level and does not hallucinate when captioning NSFW content. Haigaku-Medium was fine-tuned on an Nvidia H100 with an effective batch size of 800, while keeping the three text encoders frozen (we did not fine-tune the text encoders).
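As a rough illustration of what an "effective batch size" of 800 means, the sketch below multiplies the per-device batch size, the gradient accumulation steps, and the device count. The specific split (25 x 4 x 8) is a hypothetical example, not the configuration used for this model, which is not stated here.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int) -> int:
    """Number of samples contributing to a single optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


# Hypothetical split: these three numbers are assumptions for illustration only.
example = effective_batch_size(per_device_batch=25, grad_accum_steps=4, num_devices=8)
print(example)
```

Any other combination whose product is 800 gives the same effective batch size; only the memory and throughput trade-offs differ.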

We use a newer method to curate our dataset. Our previous model (AnySomnium) used a CLIP+MLP predictor; here we use the SigLIP-based aesthetic predictor by discus0434 instead. The dataset is built by keeping images with an aesthetic score of at least 5.5 (the maximum is 10), following the value recommended by the predictor's author. The SigLIP aesthetic predictor does a better job of scoring illustrations than the older one.
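The filtering step described above can be sketched as a simple threshold over precomputed scores. This is a minimal sketch, not our actual pipeline: the `ScoredImage` type and the sample file names are hypothetical, and the scores are assumed to have already been produced by the aesthetic predictor.

```python
from typing import Iterable, List, NamedTuple


class ScoredImage(NamedTuple):
    path: str      # hypothetical file name
    score: float   # aesthetic score, assumed to be precomputed by the predictor


MIN_SCORE = 5.5   # threshold stated in the description
MAX_SCORE = 10.0  # upper end of the score range


def filter_by_aesthetic(images: Iterable[ScoredImage]) -> List[ScoredImage]:
    """Keep only images whose score lies in [MIN_SCORE, MAX_SCORE]."""
    return [img for img in images if MIN_SCORE <= img.score <= MAX_SCORE]


# Made-up sample data for illustration.
samples = [
    ScoredImage("a.png", 4.9),
    ScoredImage("b.png", 5.5),
    ScoredImage("c.png", 7.2),
]
kept = filter_by_aesthetic(samples)
print([img.path for img in kept])
```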

The fine-tuning process uses a recent dynamic learning rate strategy that adjusts the learning rate automatically to avoid overtraining or undertraining while keeping VRAM consumption efficient. Fine-tuning was also run in TF32 (weights stored in full FP32 precision) with a multi-resolution strategy, which improves quality across a broader range of resolutions, not only at 1024px. The trained resolution goes up to 1280px.
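One common way to implement multi-resolution training is aspect-ratio bucketing, where each image is assigned to the training resolution whose aspect ratio is closest to its own. The sketch below assumes that approach; the bucket list is hypothetical (chosen only so that no side exceeds 1280px) and is not the list used for this model.

```python
import math
from typing import List, Tuple

# Hypothetical (width, height) buckets; the longest side is capped at 1280px.
BUCKETS: List[Tuple[int, int]] = [
    (1024, 1024),
    (1152, 896),
    (896, 1152),
    (1280, 768),
    (768, 1280),
]


def nearest_bucket(width: int, height: int,
                   buckets: List[Tuple[int, int]] = BUCKETS) -> Tuple[int, int]:
    """Return the bucket whose aspect ratio is closest to the image's (in log space)."""
    log_ratio = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - log_ratio))


print(nearest_bucket(1920, 1080))
print(nearest_bucket(1080, 1920))
```

Comparing ratios in log space treats landscape and portrait images symmetrically, so a 16:9 image and a 9:16 image land in mirrored buckets.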


Model Information:

There are four versions of the model. The pruned versions have no text encoders attached in the safetensors file and can be downloaded in either FP32 or BF16. The full model includes the text encoders.
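To tell a pruned file from a full one, you can look at the tensor names stored in the checkpoint. The sketch below works on a plain list of names; the `text_encoders.` prefix and the sample key names are assumptions for illustration, so check the actual key layout of the file you downloaded.

```python
from typing import Iterable, Sequence

# Assumption: text encoder weights are stored under this key prefix.
TEXT_ENCODER_PREFIXES: Sequence[str] = ("text_encoders.",)


def includes_text_encoders(tensor_names: Iterable[str],
                           prefixes: Sequence[str] = TEXT_ENCODER_PREFIXES) -> bool:
    """Return True if any tensor name starts with a text encoder prefix."""
    prefix_tuple = tuple(prefixes)
    return any(name.startswith(prefix_tuple) for name in tensor_names)


# Made-up key names for illustration only.
pruned_keys = ["model.diffusion_model.example.weight"]
full_keys = pruned_keys + ["text_encoders.example.weight"]

print(includes_text_encoders(pruned_keys))
print(includes_text_encoders(full_keys))
```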

Haigaku-Medium is also in development, with broader datasets. We will update the model regularly.