Belle Delphine - Pony

Updated: Apr 26, 2025
Tags: celebrity
Verified: SafeTensor
Type: LoRA
Published: Jun 16, 2024
Base Model: Pony
Training: 80,640 steps, 18 epochs
Usage Tips: Strength 1
Trigger Words: Belle Delphine
Hash (AutoV2): ADA6183EBB

This is a LoRA of the internet celebrity Belle Delphine for Pony Diffusion v6.


Trigger word: “Belle Delphine”.

Suggested LoRA weight: 0.4 – 1.0, depending on the style you want.

The model is relatively flexible, both in terms of prompting and when used in other finetunes based on Pony Diffusion.

Relevant additional tags might be:

  • “focused eyes” – Large open eyes

  • “braces” – Should be self-explanatory

  • “snapchat” – Images sourced from Snapchat were tagged as such, since they have lower quality and text artifacts

Should people be interested in some other tags (although their accuracy / reproduction varies), I might also share them. They mostly relate to clothing.
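
For example, a prompt combining the trigger word, a mid-range weight, and a couple of the tags above might look like this in SD.Next’s A1111-style syntax (the score_* tags are the usual Pony Diffusion v6 quality boilerplate, and the LoRA filename is just a placeholder for however you saved the file):

  score_9, score_8_up, score_7_up, Belle Delphine, focused eyes, braces, looking at viewer, portrait <lora:Belle_Delphine_Pony:0.8>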


Images were generated in SD.Next and should include metadata. I ran ADetailer for some of them; however, it isn’t always necessary (depending on the prompt and the model used).

Training


As always, I will add a little bit about the training.


I curated a new dataset for SD3; however, as the training tools aren’t up to date yet, I decided to give it a run on Pony Diffusion, since I already have a base SDXL and an SD 1.5 model.

The model you see here is not the first iteration. My initial trial was trained on the zonkey model, and it had impressive (accurate) results; however, prompting was very brittle: for example, if you left out “Belle Delphine”, you would just get complete white noise. No idea how that happened. It also didn’t work at all with external LoRAs or on other models. (That trial was trained with kohya, masked loss, and DoRA.)

So, when SD3 released and didn’t look that special, I decided to retrain on the dataset, this time on the Pony Diffusion base model, hoping for more flexibility. I also switched to OneTrainer (which I had already used to generate the initial masks).

The LoRA was trained on all images I could find (so quite many: in the low five digits). In addition, I created masks for the images in an automated fashion. I then tagged all images with wd-swinv2-tagger-v3 by SmilingWolf. For the SD3 dataset, I also labelled all images with several captions using a custom multimodal LLM workflow. (Sidenote: this took longer than training the LoRAs.)
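
For anyone who wants to reproduce the tagging step, the WD tagger is easy to script. Below is a minimal sketch using the ONNX export and selected_tags.csv that SmilingWolf publishes on Hugging Face; the preprocessing and the 0.35 threshold follow common usage of these taggers and are assumptions, not necessarily the exact settings used here:

  import csv
  import numpy as np
  import onnxruntime as ort
  from huggingface_hub import hf_hub_download
  from PIL import Image

  REPO = "SmilingWolf/wd-swinv2-tagger-v3"
  session = ort.InferenceSession(hf_hub_download(REPO, "model.onnx"),
                                 providers=["CPUExecutionProvider"])
  _, height, _, _ = session.get_inputs()[0].shape  # NHWC input, 448x448 for the v3 taggers

  with open(hf_hub_download(REPO, "selected_tags.csv"), newline="", encoding="utf-8") as f:
      rows = list(csv.DictReader(f))
  names = [r["name"] for r in rows]
  general = [i for i, r in enumerate(rows) if r["category"] == "0"]  # category 0 = general tags

  def tag_image(path, threshold=0.35):
      img = Image.open(path).convert("RGB")
      side = max(img.size)
      canvas = Image.new("RGB", (side, side), (255, 255, 255))  # pad to a white square
      canvas.paste(img, ((side - img.width) // 2, (side - img.height) // 2))
      canvas = canvas.resize((height, height), Image.BICUBIC)
      arr = np.asarray(canvas, dtype=np.float32)[:, :, ::-1]  # RGB -> BGR, pixels in [0, 255]
      arr = np.ascontiguousarray(arr[None])                   # add batch dimension
      probs = session.run(None, {session.get_inputs()[0].name: arr})[0][0]
      return [names[i] for i in general if probs[i] > threshold]

  print(", ".join(tag_image("example.png")))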


Then, once again using a custom workflow, I clustered the images based on their content and added tags to those clusters.
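
The clustering code is also custom, but the general idea can be reproduced with off-the-shelf parts: embed every image with a vision model and group the embeddings. The sketch below uses OpenCLIP and k-means purely as an illustration; the model choice, cluster count, and file layout are placeholders, not what was actually used:

  import glob
  import numpy as np
  import open_clip
  import torch
  from PIL import Image
  from sklearn.cluster import KMeans

  device = "cuda" if torch.cuda.is_available() else "cpu"
  model, _, preprocess = open_clip.create_model_and_transforms(
      "ViT-B-32", pretrained="laion2b_s34b_b79k", device=device
  )
  model.eval()

  paths = sorted(glob.glob("dataset/*.png"))
  feats = []
  with torch.no_grad():
      for path in paths:
          image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
          f = model.encode_image(image)
          feats.append((f / f.norm(dim=-1, keepdim=True)).squeeze(0).cpu().numpy())

  # Group visually similar images; each cluster can then be inspected and tagged as a whole.
  labels = KMeans(n_clusters=20, random_state=0).fit_predict(np.stack(feats))
  for path, label in zip(paths, labels):
      print(label, path)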


I finally took all of this data and trained it in OneTrainer with a random mixture of booru tags and natural-language prompts.
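
How the mixing is done is part of the custom tooling; one simple way to achieve something similar outside the trainer is to pick one caption source per image when writing the final caption files. The .tags / .caption layout below is made up for illustration:

  import random
  from pathlib import Path

  random.seed(0)
  dataset = Path("dataset")

  for image in sorted(dataset.glob("*.png")):
      booru = (dataset / f"{image.stem}.tags").read_text(encoding="utf-8").strip()
      caption = (dataset / f"{image.stem}.caption").read_text(encoding="utf-8").strip()
      # Pick one caption style per image so the trainer sees a mix of both.
      (dataset / f"{image.stem}.txt").write_text(random.choice([booru, caption]) + "\n",
                                                 encoding="utf-8")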

Relevant training parameters:

  • Prodigy optimizer

  • 18 epochs @ 4,480 steps per epoch (80,640 total)

  • No image repeats (after all I had sufficiently many)

  • Batch size 4

  • Using image masks, with an unmasked probability of 0.03 and unmasked weight of 0.02 (which causes the occasional watermark bleed-through)

  • 1024 resolution with aspect bucketing

  • LoRA rank 96, alpha 2 (later resized to a target rank of 64 with sv_fro 0.98; see the sketch below)
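
On the resize: sv_fro 0.98 presumably refers to dynamic-rank resizing in the style of kohya’s resize_lora.py, i.e. per module, rebuild the weight delta from the LoRA pair, keep only enough singular values to retain about 98% of its Frobenius norm (capped at the target rank), and split it back into a smaller pair. The sketch below shows that idea for a single module; real tooling additionally handles alpha scaling, conv layers, and the full state dict:

  import torch

  def resize_pair(up: torch.Tensor, down: torch.Tensor, max_rank=64, fro_target=0.98):
      # Rebuild the weight delta for this module and truncate its SVD so that
      # roughly fro_target of the Frobenius norm is kept, capped at max_rank.
      delta = up.float() @ down.float()
      U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
      energy = torch.cumsum(S**2, dim=0) / torch.sum(S**2)
      k = int(torch.searchsorted(energy, torch.tensor(fro_target**2)).item()) + 1
      k = min(k, max_rank)
      new_up = U[:, :k] * S[:k]  # fold the singular values into the up matrix
      new_down = Vh[:k, :]
      return new_up, new_down, k

  # Toy check with a random rank-96 pair for a 1280x1280 layer.
  up, down = torch.randn(1280, 96), torch.randn(96, 1280)
  new_up, new_down, k = resize_pair(up, down)
  print(k, (torch.norm(up @ down - new_up @ new_down) / torch.norm(up @ down)).item())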


Training took about 48 hours on an RTX 4090.


If you have any additional questions, feel free to ask. I used tons of custom, purpose-built code here, which is probably nothing I will share; however, I can give you some pointers should you be interested in doing something similar.


Disclaimer


I want to highlight again that this model is non-commercial, and you should only post images on CivitAI which follow the Content Rules.

Users are solely responsible for the content they generate using this LoRA. It is the user’s responsibility to ensure that their usage of this model adheres to all applicable local, state, national and international laws. I do not endorse any user-generated content and expressly disclaim any and all liability in connection with user generations.