Hello!
I have not found a good answer to this question anywhere: how can you train your own general-purpose model and then use it in automatic1111's webui?
For example, how can you make a model like Protogen Infinity or Dreamlike Diffusion 1.0, but without merging other models? How many images do you need for training: 100,000? 1,000,000? More or fewer? Do you need captioning? Is Dreambooth good enough for training, or is there something better? How many steps are recommended? I heard that you multiply the number of images by 100 or 200 to get the number of steps (so for 30 images, you would use 3,000 or 6,000 steps). What other settings are important, and what values are recommended?
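To make the arithmetic of that rule of thumb concrete, here is a minimal sketch. The helper name `estimate_steps` is made up for illustration and is not part of Dreambooth or any other tool, and the multipliers of 100 and 200 are only the hearsay values mentioned above, not verified recommendations.

```python
def estimate_steps(num_images: int, steps_per_image: int = 100) -> int:
    """Rule-of-thumb step count: number of images times a per-image multiplier.

    Both the helper and the default multiplier are illustrative assumptions.
    """
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    if steps_per_image <= 0:
        raise ValueError("steps_per_image must be positive")
    return num_images * steps_per_image


if __name__ == "__main__":
    # Compare the two multipliers from the rule of thumb for a 30-image set.
    for multiplier in (100, 200):
        steps = estimate_steps(30, multiplier)
        print(f"30 images x {multiplier} = {steps} steps")
```

Whether this kind of linear scaling still makes sense for hundreds of thousands of images is exactly what I am unsure about.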
Thanks for understanding and for answering!
2 Answers
Hello Jalakan,
While Dreambooth can be leveraged for smaller-scale training, I think you are looking at large-scale fine-tuning at that point. StableTuner and its Discord might be able to provide better direction on that; see https://github.com/devilismyfriend/StableTuner.
Training from scratch can be hard: without a varied sample set, the model will pigeonhole itself into not knowing a ton of words or concepts. So while you may want a model that makes only one style, without the base vocabulary (such as what a house is, what a girl is, what a cat is) it wouldn't be able to do much with that style. That said, I was able to train a model from scratch as a test (not on here) that was usable, just super specific, with under 20k images. Not ideal, but it can be done.