This is just an article about my Onibi Series loras for Pony and Illustrious.
I've been wanting to go over the details and changes in different versions, but never got around to it, so they'll be covered in this document!
...by the way, this probably won't be structured well.
First of all, what characters are included in the models, and which characters are not?
I can proudly say that all characters appearing in the series' PVs are included. Characters who don't appear in PVs are not included. The one exception is Tsukuyomi, who isn't in any of the PVs but has an official design. (Obviously, mentalism maria doesn't count.)
Therefore, until they're given official designs (and have enough data), characters like Shizune, Tomonari, Akari's brother, and so forth will not be included.
What are the changes between V1 and V2? [Pony]
Well, the quickest change to explain is that I trained for more steps, so there was a natural improvement. It was also trained on-site (on civitai) rather than via Google Colab, but whether that's an improvement is debatable...
The main difference is that I cleaned up the dataset a fair amount. Since auto-tagging had been used, there were a lot of issues and inconsistencies with the tags, so I went in and tried to fix certain problems.
I also gave some concepts their own tags! My bad for never mentioning it, but I added a tag for each song in the series, so the scenarios from those songs can be vaguely recreated. Here are the tags used:
onigumo to kitsune no shishi to
kitsune no yomeiri
kimikage enbukyou
amanojaku song (shikyou amanojaku, but named this way so it wouldn't get mixed with shikyou's character tag)
kubinashi enbukyou
meikyou oniwarabe
himeyuri enbukyou
onibi song
I also added tags for Akari's different appearances across different songs, such as:
ongmakari (Akari in onigumo to kitsune no shishi to)
knyakari (Akari in kitsune no yomeiri)
tfwakari (Akari in the fox's wedding remix)
hmyrakari (Akari in himeyuri enbukyou)
(Even in the most recent version of the lora, I didn't make a tag for her canon design for some reason, so that's probably something I'll add in the next update... whenever that happens)
As for something that actually turned out useful, I made a tag called "akari kitsune mask" so her specific mask design can be generated instead of a generic fox mask.
I also added an "official art" tag to generate images more accurate to the original style of the series' art.
What are the changes between V2 and V3? [Pony]
I'd say V3 was the biggest update between versions so far.
The first and most obvious change is that characters' physical features were finally pruned into their tags. Basically, for each character, I removed every tag that describes the character's hair and eye colour, hairstyle and length, etc. from the dataset so that the model would 'understand' that those features are innately part of that character, and therefore don't need to be specified in the prompt. This makes prompting much quicker, since fewer tags need to be typed in.
For example, Mai's character prompt from V1-2:
"shishikusa mai, green hair, green eyes, short hair, red bow, hair bow, beige kimono, red hakama, hakama short skirt, black thighhighs"
was shortened to simply:
"shishikusa mai, mai hair bow, mai hakama, black thighhighs".
(As you can see, clothing tags still have to be specified because they're not an unchanging part of the character. However, the clothing tags themselves are also shortened for convenience)
This also helps to make the characters more consistent and prevent different characters' features from getting mixed together, a common flaw multi-character loras have.
Speaking of characters' features getting mixed together, this is the reason Ryou and Tsukuyomi's tags were shortened to "shishiryou" and "oborotsukuyomi" respectively. The previous tags "shishikusa ryou" and "oborozuka tsukuyomi" were occasionally problematic because they were too similar to Mai's and Akari's tags, which 'confused' the model.
(...by the way, Akari kept her hair colour/style tags because they do change.)
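To make the pruning step concrete, here's a rough Python sketch. It assumes captions are comma-separated tag strings (like the prompts above); the `PRUNED_TAGS` mapping and the `prune_caption` function are made up for illustration and aren't the actual script or full tag lists used for the lora.

```python
# Hypothetical sketch of pruning a character's fixed features out of captions.
# PRUNED_TAGS is an assumed mapping, based only on the Mai example above.
PRUNED_TAGS = {
    "shishikusa mai": {"green hair", "green eyes", "short hair"},
}


def prune_caption(caption: str, pruned_tags: dict[str, set[str]]) -> str:
    """Drop feature tags for every character whose tag appears in the caption."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    to_remove: set[str] = set()
    for character, features in pruned_tags.items():
        if character in tags:
            to_remove |= features
    # Keep the original tag order, minus the pruned features.
    return ", ".join(t for t in tags if t not in to_remove)


caption = "shishikusa mai, green hair, green eyes, short hair, black thighhighs, smile"
print(prune_caption(caption, PRUNED_TAGS))
# shishikusa mai, black thighhighs, smile
```

Captions that don't contain the character tag are left alone, so a "green hair" tag on some other character isn't touched.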
The second big change was equally important, if not more so: I separated the characters' training data into different folders instead of keeping it all in the same folder. The reason is that on the colab trainer I use, you can set different amounts of repeats for different folders. So what I did was give characters with more images fewer repeats, and characters with fewer images more repeats. This effectively (albeit not perfectly) makes sure every character is represented about equally in training even if they have less data.
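Here's a rough sketch of the balancing idea in Python. The folder names and image counts are placeholders, and `balanced_repeats` is a hypothetical helper, not something from the colab trainer; it just picks repeats so that images × repeats comes out roughly the same for each folder.

```python
# Hypothetical sketch: choose per-folder repeats so each character contributes
# about the same number of samples per epoch. All counts are placeholders.
def balanced_repeats(image_counts: dict[str, int], target: int) -> dict[str, int]:
    """Return repeats per folder so that count * repeats is close to target."""
    return {
        name: max(1, round(target / count))
        for name, count in image_counts.items()
    }


image_counts = {"character_a": 200, "character_b": 50, "character_c": 20}
repeats = balanced_repeats(image_counts, target=200)
print(repeats)
# {'character_a': 1, 'character_b': 4, 'character_c': 10}
```

Since repeats have to be whole numbers, the totals only line up exactly when the counts divide the target evenly, which is one reason the balancing isn't perfect.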
And of course, it was trained for more steps than V2, and trained on colab instead of civitai, so it was improved by default.
The upgrade to Illustrious
At the end of November last year, I discovered that Illustrious-XL is frankly far superior to Pony Diffusion in just about every way, at least for anime-style art. Output quality is better, tagging is better, character knowledge is better, the dataset is newer, there's much more style flexibility, you don't need those damn score_up tags at the start of every prompt, etc... The one thing I can say is better about Pony is that the base model on its own is more usable than base Illustrious, but that's not a big deal when finetunes exist. Pony is also better at certain "niche interests", but I will not be elaborating.
Conclusion for now
I'd say the Illustrious version is the best version of the lora so far. I'm not sure what major improvements can be made going forward, but there is room for some tweaks and maybe trying out a new base model...
With that said, here's a sneak peek at what's next.