AI Style Training - My insights on AI Style training on Dataset preparation + Experiment inside

Hi there!

First of all, and this is important: this is not how I usually train, nor is it a guide to settings or parameters. It is a cover-to-cover walkthrough of an experiment I did, and I don't recommend that anyone train like this. However, it should give you a good look at how my brain works during training.

As a pro tip: if you want to do some cool trainings, use clean, sharp, detailed, good-quality images; 3D renders and clean MidJourney outputs work well here. Just remember not to overfit the model and you will be good.

Also, as a fair tip and a personal preference: avoid using just one artist in the training. Use several, with a main concept/genre in mind; it simplifies everything. Otherwise you will be dealing with a lot of problems.

So once again... this is not about settings or parameters. There are some very good guides on CivitAI that you can use for that. This guide is more about a way of thinking than about combinations of learning rates and so on. A good point here is that I always change things here and there and run several trainings until I get the version I want, or at least one close to what I have in my head.

Short thing: Dataset + Dataset Prep. >>>>>>>>> everything.

--


Before training a style you need to figure out which sources you will use for the training: only original sources, scans, crappy pictures (please don't), generative works, a mix of everything, etc. I wrote a mini table of the multiple ways to train a style (on SD 1.4/1.5 or SDXL). There are probably more, but these are my go-tos when I try to train a focused style. Let's take a look:

[Method A] Pure style-work (Simulation of an existing real-life style)

Pros: Style is 100% the artist's own, actual simulations.

Cons: Mostly low-quality image resources for old artists' paintings, old textures/scans, may require clone stamp/inpainting*

A.1) Original Real Sources old paintings*

A.2) Original Real Sources modern paintings* <- Preferable

Note: in A.1, if the artist is ultra popular, there are sometimes good inputs.

[Method B] Mixed RealGenre-style-work aka "Genre art training"

Pros: High-quality detailed images

Cons: Output is biased toward the genre, not toward any single artist's own style

B.1) Original Real Sources old paintings + Modern paintings (Genre art)

B.2) Original Real Sources for Modern Masters (modern genre art)

[Method C] Semi-Pure style-work: Original Sources + AI Gen Sources

Pros: High-quality detailed images, better control

Cons: Output is biased toward the global sum of the sources, not toward the artist's own style

C.1) Original Real Sources Old + AI Gen Sources (AI Derivative Style work)

C.2) Original Real Sources Modern + AI Gen Sources (AI Derivative Style work)

[Method D] Semi-Pure Genre-style-work: Original Sources + AI Gen Sources on diff. artists

Pros: High-quality detailed images, better control

Cons: Output is biased toward the global sum of the sources, not toward any artist's own style

D.1) Original Real Sources (mixed=Old+Modern) + AI Gen Sources (AI Derivative Style genre art)


[Method E] Generative style works training

Pros: High-quality detailed images, insane control, best quality outputs possible

Cons: The artist's style is biased by AI derivatives, so be careful. Not a real representation of the artist.

E.1) Generative MJ/DE3/etc. sources (artist semi-representation)


[Method F] Generative style Genre

Pros: Diversity, high-quality detailed images, insane control, best quality outputs possible

Cons: The genre is minimally biased by AI derivatives

F.1) Generative MJ/DE3/etc. sources (genre AI representation)

[Method G] "Wholelot"

Pros: Diversity, high-quality detailed images, insane control, best quality outputs possible

Cons: Nothing really, except that you need to be careful with old textured images.

G.1) Original Real sources (mixed) + Generative art (mixed) <- Preferable


So... depending on the category of the dataset you will need more or less cleaning, refining, polishing, clone stamping, inpainting, sometimes redrawing, etc. If you want to avoid this messy task and the hours in Photoshop that come with it, use modern paintings (mainly because they are not as textured as older ones, which is an important point), modern artists' styles, good generative inputs, etc. Do all you can to get a more coherent dataset that the AI can understand.
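To get a rough idea of how much cleanup a dataset will need before opening Photoshop, a quick triage by image size can help. This is only a minimal sketch and not part of my actual workflow: the `ImageInfo` class, the labels and the 512 px cutoff are hypothetical, and the 1024 px target assumes SDXL's native resolution.

```python
from dataclasses import dataclass

TARGET_SIDE = 1024  # assumed target: SDXL's native resolution (use 512 for SD 1.4/1.5)
MIN_SIDE = 512      # hypothetical cutoff; below this the image gets a manual look


@dataclass
class ImageInfo:
    name: str
    width: int
    height: int


def triage(image: ImageInfo, target: int = TARGET_SIDE, minimum: int = MIN_SIDE) -> str:
    """Label an image 'ready', 'upscale' or 'review' based on its shorter side."""
    short_side = min(image.width, image.height)
    if short_side >= target:
        return "ready"
    if short_side >= minimum:
        return "upscale"
    return "review"


def triage_dataset(images):
    """Group image names by triage label."""
    groups = {"ready": [], "upscale": [], "review": []}
    for image in images:
        groups[triage(image)].append(image.name)
    return groups
```

Size is only the first filter. Texture, multiple figures and broken faces still need a human eye, so treat the "ready" group as "ready to be looked at", not "ready to train".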

Crucial things: avoid multiple figures in the scene, avoid complex interactions, avoid unrecognizable faces, etc. These are the main things the AI won't understand.

That being said, I did a training to illustrate this and to show how I solved, step by step, the problems I ran into. You know the typical D&D cover art from the 90s? Well, I chose one "old" artist for this experiment, in this case Jeff Easley, and tried to see if I could capture a big chunk of his style while keeping the model flexible and not overfitted. And yes, I was a hardcore D&D player in my younger days.

Let's see the paintings of this artist. Wonderful! (And hard!) The main problem is that the cover art and images are typically groups of sorcerers, girls, warriors, dragons, etc., so I have to deal with heavily textured images that are, in most cases, medium quality. Should the training be avoided? Yes? No?

First trainings (a run with the uncropped original pictures without any retouching):

Some good images, but the faces are destroyed, there is a lot of edging, and in general they don't give off Easley's vibes. Trash. Next.


Second run. Curated dataset with more images, getting rid of some, etc. Basically more time spent saying this goes in, this goes out:

OK! Not bad, but there are still problems with grouping, edging and maybe faces. Still, it works better than the first one. Let's improve from here. Next:

Third run. I realized that artstyle/illustration and the captions weren't working as I thought, so I got rid of those and polished the set further, this time upscaling images and cropping some areas of interest.
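The "crop an area of interest, then upscale" step can be sketched as a bit of arithmetic. I did this by hand in Photoshop, so the helper names below are hypothetical and the 1024 px target is an assumption based on SDXL. The box uses the (left, top, right, bottom) convention, which is the same order Pillow's `Image.crop` expects.

```python
def square_crop_box(img_w, img_h, cx, cy, side):
    """Square crop of `side` pixels centred on (cx, cy), shifted to stay inside the image.

    Returns (left, top, right, bottom) in pixel coordinates.
    """
    side = min(side, img_w, img_h)  # the crop can never be larger than the image
    left = min(max(cx - side // 2, 0), img_w - side)
    top = min(max(cy - side // 2, 0), img_h - side)
    return (left, top, left + side, top + side)


def upscale_factor(side, target=1024):
    """How much a square crop must be enlarged to reach the target side (never below 1)."""
    return max(1.0, target / side)


# Example: a face near the top-left corner of a 1600x1200 scan.
box = square_crop_box(1600, 1200, cx=100, cy=100, side=600)
factor = upscale_factor(box[2] - box[0])
```

The tighter the crop, the bigger the upscale factor, and the more the old paint texture gets amplified. That trade-off is why cropping, upscaling and cleaning end up going together.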



OK, OK! That's very nice. I like it. However, in some images I noticed a lot that is off, and the edging is still there. But the style is there too. Definitely feeling it. Let's push this further...

Fourth run (at this point I felt the output was good for me, but still... not Easley enough!). Inpainted faces, zoomed in, clone-stamped unneeded areas, more cropping, upscaling and refining... I feel like I'm spending more time in Photoshop than testing prompts, lol.

IMO this was it, however... the "edging". Let's take the red minotaur as an example, original size:

Umm. Something is off. Let's see more...



Yeah... They're cool. They're usable and I can even feel some Easley vibes. Time to stop? What if I go further and try to make it better?

After almost 9 trainings I go with the final training, 5th run! Inpainting on the borders of the bodies, clone stamp too, inpainting faces for details (always with low denoising).

I think it's almost perfect (to me). I'm pretty happy with this, and at this point I changed the inference CFG scale from 7 to 2-4 (the past ones were at 7).
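For anyone wondering what dropping the CFG scale actually does: classifier-free guidance pushes the noise prediction away from the unconditional one and toward the prompt-conditioned one, and the scale controls how hard that push is. Below is a toy sketch on plain lists. Real samplers do this on tensors at every denoising step, and `apply_cfg` is a hypothetical name, not any library's API.

```python
def apply_cfg(uncond, cond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), element by element."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy unconditional noise prediction
cond = [1.0, 3.0]    # toy prompt-conditioned noise prediction

strong = apply_cfg(uncond, cond, 7)  # the setting I used on earlier runs
gentle = apply_cfg(uncond, cond, 3)  # inside the 2-4 range I switched to
```

At scale 1 the result is just the conditioned prediction. The higher the scale, the further the result is extrapolated past it, so a lower scale leaves more of the look to what the trained model learned instead of forcing the prompt.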

The only thing I can say is that sometimes it follows Easley's lining more and sometimes less, but it's OK...

A minimal revision to refine it a bit more, to see if I can get a match for the lining style...

Now yes. Perfect for me and totally what I wanted with enough flexibility to do anything.

End of the experiment. Enough of Easley, lol.


I will put some examples of actual inpainting, fixing, zooming+cropping, redrawing, etc. to clean the original set:





Remember that none of the image outputs I posted here are inpainted, upscaled or detailed. They are RAW outputs, so faces need fixing with ADetailer, etc.

So that concludes my experiment and this guide, which is meant to show that, more than settings/parameters, what you should focus on is the dataset and its preparation and refining. Sometimes it's madness.


Note: This was done for SDXL. You can do it for SD 1.4/1.5 though.


Hope you enjoyed this!
