Guide to creating character LoRAs from an extremely small, low-quality dataset.

What if your character is extremely niche and only a few pictures of him/her exist? And what if those pictures are low quality?

I faced this challenge, and here is my approach.

____________________________________________________________________________________________

tl;dr

Train a Flux LoRA on the small, low-quality dataset using extremely minimalist tags.
Use a captioning LLM such as ChatGPT to create a detailed description of the character from the original images.
Use the Flux LoRA combined with the detailed description above to generate a new dataset.
Use the new dataset to train a PonyXL model.

____________________________________________________________________________________________

So I was commissioned to train a model of Piety from Path of Exile.

This is the dataset I had to work with

The full dataset can be downloaded from this model page here.

The dataset consisted of 2 decent pictures of her and screenshots of her low-polygon in-game model.

I tagged this dataset manually using minimalist tagging:

3d model of p13ty_PoE.  Low polygons, low quality. Character design.

The character stands in a T-pose, emphasizing the full extent of her costume and detailed armor.

In particular, I did NOT tag her attire.
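The minimalist tagging step can be sketched as a small script. This is only an illustration under assumptions: it assumes the trainer reads a same-named .txt caption file next to each image (a common convention, but check your trainer), the "ingame_" filename prefix is a hypothetical way to tell screenshots apart, and the caption used for the non-screenshot pictures is a placeholder I made up, not the one used for this model.

```python
from pathlib import Path

TRIGGER = "p13ty_PoE"  # trigger word taken from the caption above
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def minimal_caption(is_game_screenshot: bool) -> str:
    """Build a caption that names the character but never describes her attire."""
    if is_game_screenshot:
        return f"3d model of {TRIGGER}. Low polygons, low quality. Character design."
    # Placeholder caption for the better pictures (an assumption, adjust by hand).
    return f"Illustration of {TRIGGER}."


def write_captions(dataset_dir: str, screenshot_prefix: str = "ingame_") -> int:
    """Write one .txt caption next to each image and return how many were written."""
    written = 0
    # sorted() materialises the listing before any new .txt files are created.
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue
        caption = minimal_caption(image.name.startswith(screenshot_prefix))
        image.with_suffix(".txt").write_text(caption, encoding="utf-8")
        written += 1
    return written
```

With a dataset this small, reviewing each generated caption by hand afterwards is still practical, and pose notes such as the T-pose line above can be added per image.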

At this point I took the best picture in the dataset, uploaded it to ChatGPT (any other LLM will work fine, or you can do it manually) and asked it for a very detailed description.

I removed irrelevant parts, and kept the description of her attire.

 warrior dressed in ornate, medieval-style armor extend to her elbows, providing full arm coverage. The helmet has a wide, curved shape with a ridged edge at the top, and the front features small cutout patterns for breathability or decoration. The face of the warrior is partially obscured by the helmet, but their mouth and lower chin are visible.

At this point I had a series of Flux LoRA checkpoints saved at different epochs. I picked among them at random and used each one with the detailed description as the prompt.

I tried adding different art styles (e.g. anime) and poses to diversify the dataset.
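The random mixing of checkpoints, styles, and poses could look roughly like the sketch below. It only builds a list of generation jobs; feeding them to your image generator is left out. The checkpoint filenames, the style list beyond "anime", and the pose list are all hypothetical, and DESCRIPTION is a shortened stand-in for the full description.

```python
import random

# Shortened stand-in; paste the full LLM-written description here.
DESCRIPTION = "warrior dressed in ornate, medieval-style armor"

# Hypothetical filenames for LoRA checkpoints saved at different epochs.
CHECKPOINTS = [
    "piety_flux-epoch04.safetensors",
    "piety_flux-epoch08.safetensors",
    "piety_flux-epoch12.safetensors",
]
STYLES = ["anime", "digital painting", "photo"]  # only "anime" comes from the text
POSES = ["front view", "side view", "from behind", "sitting"]  # assumed examples


def build_jobs(count: int, seed: int = 0) -> list[dict]:
    """Return `count` jobs, each pairing a random checkpoint with a varied prompt."""
    rng = random.Random(seed)  # seeded so an overnight run can be reproduced
    jobs = []
    for _ in range(count):
        style = rng.choice(STYLES)
        pose = rng.choice(POSES)
        jobs.append({
            "lora": rng.choice(CHECKPOINTS),
            "prompt": f"{style} style, {DESCRIPTION}, {pose}",
        })
    return jobs


if __name__ == "__main__":
    for job in build_jobs(5):
        print(job["lora"], "|", job["prompt"])
```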

I got roughly 140 images (I let it run overnight; fewer images would have been fine). Of those, 120 were good quality.

I tagged them for Pony. Again, I avoided describing the outfit: just a trigger word, medium/close-up shot, style, and background.
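As a rough sketch of that caption layout, the helper below joins the four pieces into one tag string and refuses any tag that mentions the outfit. The list of outfit words is a hypothetical guard of my own, and reusing the same trigger word as in the Flux captions is an assumption.

```python
# Hypothetical guard list: words that would mean the outfit is being described.
OUTFIT_WORDS = ("armor", "helmet", "gauntlet")


def pony_caption(shot: str, style: str, background: str,
                 trigger: str = "p13ty_PoE") -> str:
    """Build 'trigger, shot, style, background' and reject outfit descriptions."""
    tags = [trigger, shot, style, background]
    for tag in tags[1:]:
        lowered = tag.lower()
        if any(word in lowered for word in OUTFIT_WORDS):
            raise ValueError(f"outfit term found in tag: {tag!r}")
    # Skip empty pieces so a missing background does not leave a stray comma.
    return ", ".join(tag.strip() for tag in tags if tag.strip())


if __name__ == "__main__":
    print(pony_caption("close-up", "anime", "simple background"))
```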

Then I started the Pony training.

It worked surprisingly well.

You can see my intermediary Flux model HERE and my final PonyXL model HERE.

____________________________________________________________________________________________

Errors and pitfalls I encountered that should be avoided in the future.

Some of the Flux images had the wrong armor colour. I ignored this and threw them into the trainer anyway, which made the Pony model even more unstable in terms of armor colours. I had to insert the trigger word "red uniform" to compensate.

Most of the Flux images were front views. The resulting Pony model had a strong bias towards front-view, symmetrical images. Next time I need to diversify the poses and angles of my generated dataset.
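One way to catch this bias before training is to count the view tags in the captions of the generated dataset. The sketch below is only a suggestion: it assumes comma-separated captions that contain a view tag, and both the view names and the 50% threshold are arbitrary choices of mine.

```python
from collections import Counter

# Assumed view tags; use whatever tags your captions actually contain.
VIEWS = ("front view", "side view", "from behind", "three-quarter view")


def view_distribution(captions: list[str]) -> Counter:
    """Count how many captions carry each known view tag."""
    counts = Counter()
    for caption in captions:
        tags = {tag.strip().lower() for tag in caption.split(",")}
        for view in VIEWS:
            if view in tags:
                counts[view] += 1
    return counts


def overrepresented(counts: Counter, max_share: float = 0.5) -> list[str]:
    """Return the views that make up more than `max_share` of the tagged images."""
    total = sum(counts.values())
    return [view for view, n in counts.items() if total and n / total > max_share]


if __name__ == "__main__":
    sample = [
        "p13ty_PoE, close-up, anime, front view",
        "p13ty_PoE, medium shot, anime, front view",
        "p13ty_PoE, medium shot, photo, front view",
        "p13ty_PoE, medium shot, photo, side view",
    ]
    print(overrepresented(view_distribution(sample)))
```

If one view dominates, generate more images of the other angles (or drop some of the dominant ones) before starting the Pony training.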
