Megumi Kato PonyXL

Verified: SafeTensor
Type: LoRA
Stats
163
59
Reviews
Published: Nov 12, 2024
Base Model: Pony
Training Steps: 1,100
Trigger Words: katomegumi
Hash (AutoV2): B5C7BEFB03
sumai

20241112v4

Megumi Kato from Saenai Heroine no Sodatekata

Trigger Words: katomegumi

Trained with Pony Diffusion V6 XL checkpoint.

I have enabled moderation on the gallery, so NSFW content may not appear.

Because many images get "flagged for review", they may take many hours to become viewable.

The original style has been strengthened, and the bad-hands problem has improved to some extent.

Many of the images I post have had their facial details repaired with ADetailer.

Problems

  • Bad hands still occur.

  • Many outfits were removed from the dataset to keep the style consistent.

Usage tips

  • The model's output may heavily rely on the checkpoint.

  • The character's face, eyes, and pupils can be repaired through ADetailer.

  • You can find example prompts in my posts.

  • My prompts are basically composed in the order of [character traits] + [style] + [expression] + [clothing] + [camera and action] + [background], and you can delete or modify them as needed.

  • Recommended weight: 0.6~1. Adjust as needed until the character's appearance meets your requirements.

  • Recommended upscale factor: around 1.2~2.0, with a denoising strength of 0.2.
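
The prompt order and weight range from the tips above can be sketched as a small helper. This is only a minimal sketch, not the author's tooling: the `<lora:name:weight>` tag assumes an AUTOMATIC1111-style web UI, the file name `megumi_kato_ponyxl` is hypothetical, and every example tag except the trigger word is made up for illustration.

```python
# Order of prompt groups as described in the usage tips.
PROMPT_ORDER = [
    "character", "style", "expression",
    "clothing", "camera_action", "background",
]


def build_prompt(groups, lora_name="megumi_kato_ponyxl", weight=0.8):
    """Join tag groups in the recommended order and append a LoRA tag.

    lora_name is a hypothetical file name; the <lora:...> syntax assumes
    an AUTOMATIC1111-style web UI.
    """
    if not 0.6 <= weight <= 1.0:
        raise ValueError("weight outside the recommended 0.6~1 range")
    tags = []
    for key in PROMPT_ORDER:
        # Missing groups are skipped, since any group may be deleted as needed.
        tags.extend(groups.get(key, []))
    tags.append(f"<lora:{lora_name}:{weight}>")
    return ", ".join(tags)


prompt = build_prompt({
    "character": ["katomegumi"],   # trigger word from this page
    "expression": ["smile"],       # hypothetical example tag
    "background": ["classroom"],   # hypothetical example tag
})
print(prompt)
```

Groups can be passed in any order; the helper always emits them in the order listed in the tips and rejects weights outside the recommended range.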

20241012v3.5

The training of this model and the images it generates are solely for learning purposes.

Megumi Kato from Saenai Heroine no Sodatekata

Trigger Words: megumi katou

Notice:

  • Add "3D" and "bad hands" to the negative prompt to improve the model's output.

  • The model's output may heavily rely on the checkpoint.

  • You can find example prompts in my posts. Prompts from posts for previous versions also work; remember to replace the old trigger word with the new one.

  • The tags animated, bloom, and lens flare can help the model generate images that look closer to animation.

  • My prompts are basically composed in the order of [character traits] + [style] + [expression] + [clothing] + [camera and action] + [background], and you can delete or modify them as needed.

  • Recommended weight: 0.6~1. Adjust as needed until the character's appearance meets your requirements.

  • Recommended upscale factor: around 1.2~2.0, with a denoising strength of 0.2.

  • Facial distortion occurs easily in situations such as full-body shots. If the face is distorted, consider using ADetailer to repair it.
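
Reusing prompts from earlier versions means swapping the old trigger word for the new one. A minimal sketch of that replacement, assuming prompts are plain comma-separated tag strings; the trigger words are the ones listed on this page, while the function name and the other example tags are hypothetical.

```python
# Trigger words used by other versions, taken from this page.
OLD_TRIGGERS = {"megumi kato", "megumikato", "katomegumi"}
NEW_TRIGGER = "megumi katou"  # trigger word of version 20241012v3.5


def migrate_prompt(prompt, new_trigger=NEW_TRIGGER, old_triggers=OLD_TRIGGERS):
    """Replace any old trigger tag in a comma-separated prompt."""
    tags = [t.strip() for t in prompt.split(",")]
    # Compare whole tags so "megumi kato" never matches inside "megumi katou".
    migrated = [new_trigger if t.lower() in old_triggers else t for t in tags]
    return ", ".join(t for t in migrated if t)


print(migrate_prompt("megumikato, school uniform, smile"))
```

Matching whole tags rather than substrings avoids accidentally rewriting a prompt that already uses the new trigger word.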

Log:

There are still many details that haven't been trained properly, but I've already spent too much time on this character, so it's time to wrap it up.

As for the issue of the character's pupils being blurry, I used many techniques but still couldn't get the model to learn it. Using ADetailer for fixing might bring some improvement.

The character's hands also often turn out poorly, but they're much better than before.

Throughout the training process, the most challenging part for me was having a dataset that contained multiple styles at once (even though the collected images were as similar in style as possible, there are still significant differences upon closer inspection). At the same time, I wanted the model to learn the style that had the greatest impact on the target outcome. To be honest, I still don't have a clear idea on how to achieve this.

20241007

Trigger Words: megumikato

If improvements are to be made, the next goals would be:

  • Enhance the sense of depth and spatiality in the image. The current model's style is too flat.

  • Increase the training repetition of a specific piece of clothing or concept.

  • Add new outfits.

Log:

Conclusion: The dataset and its captions are the most important things in training.

I feel like I've retrained it more than 40 times now. The model has achieved a result that I'm satisfied with. I've strengthened the model's style and also alleviated the problem of limb distortion and hand collapse in the previous two versions.

In the previous version, the character's image broke down very easily, and style consistency was poor. The model also failed to learn many details. For example, Megumi Kato's pupils always came out blurry, no matter how the prompt was written.

To solve these problems, I tried many methods. At first, I used tags to separate styles: I added a corresponding tag to the images of each style in the dataset and placed the images of the target style in a reinforced subset, hoping to steer the model's overall style and fix the consistency problem. This method did not work well, and the results were still very bad.

Then I kept adding data to the dataset, still mainly images consistent with the original style.

At the same time, I gradually increased the training steps, up to a maximum of 5,000. The resulting style was stable, but the character's limbs were severely distorted. The model was overtrained.

By this point I had probably retrained it more than 40 times, and the results still had not improved.

Finally, I began to realize that the problem was the dataset. Many images in the dataset are very ornate, especially some hand close-ups. A human easily recognizes these as an exaggerated perspective style, but a machine sees them completely differently.

So the real problem is that, for a machine, many images in the dataset have little reference value.

So I cleaned up the dataset again and added many reference images with correct three-dimensional perspective (mainly from Koikatsu character cards; thanks to all the creators who share them). I also kept the tag distinction and reduced the training steps back to around 1,000, which produced the current result.

However, this model also has a data pollution problem. I made a mistake in the tags of one dataset image, which affected the model's performance on the "blue school blazer" tag. Images generated with this tag tend to come out green instead of blue.
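
A mislabeled caption like this is easier to catch if the captions are audited before training. The following is a minimal sketch, not the workflow actually used here: it assumes one comma-separated `.txt` caption file per image, and all function names are hypothetical. It cannot tell whether a tag is correct; it only shows how often each tag occurs and which files carry it, so the paired images can be checked by eye.

```python
from collections import Counter
from pathlib import Path


def read_tags(caption_file):
    """Return the comma-separated tags in one caption file."""
    text = Path(caption_file).read_text(encoding="utf-8")
    return [t.strip() for t in text.split(",") if t.strip()]


def tag_counts(dataset_dir):
    """Count how many caption files contain each tag."""
    counts = Counter()
    for caption in sorted(Path(dataset_dir).rglob("*.txt")):
        # set() so a tag repeated within one file is counted once.
        counts.update(set(read_tags(caption)))
    return counts


def files_with_tag(dataset_dir, tag):
    """List caption files containing the tag, for manual review of the paired images."""
    return [
        p for p in sorted(Path(dataset_dir).rglob("*.txt"))
        if tag in read_tags(p)
    ]
```

For example, `files_with_tag("dataset", "blue school blazer")` would list every caption carrying that tag, where `"dataset"` is a placeholder path.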

20241001

Trigger Words: katomegumi

I switched to a different checkpoint for retraining, and I have to admit that the impact of the checkpoint is really significant!

I also redrew and manually repaired the poor-quality images in the dataset, bringing them up to high definition, and retrained at least 8 times.

However, the model has learned tags for complex actions only moderately well, and body parts are still prone to distortion.

20240928

Megumi Kato from Saenai Heroine no Sodatekata

Trigger Words: megumi kato

Recently, I've been so busy with work that I haven't had much time to train models.

Today, I finally had a break and spent some time collecting a dataset and training this model.

Due to some image quality issues in the dataset, this model sometimes generates poor-quality images. As before, the hands are prone to distortion.