How to Develop Your Style (Setup) for Stable Diffusion

The following is meant not as a beginner's guide to using Stable Diffusion but rather as a guide for creating one's own setups (combining prompts, LoRAs and embeddings). (It is a revised version of an article originally posted on my DeviantArt page.)

The goal is to create setups that reliably generate good images at 512x512 (or similarly low resolution), which can then be improved further via inpainting and/or upscaling.

It's best to go through the following process separately for very different styles (realistic and anime, for example). If you want the best results, don't try to create one setup that is good at everything; create separate setups for different purposes instead.

If you want more detailed explanations of the principles behind this approach and/or a list of resources I recommend, have a look at my other articles.

1) Check out sample images

A good way to start is to look at images you like. On Civitai, generation data is displayed for most of the images that have been posted here. Most images in .png format retain their metadata, which can be read in Automatic1111 (on the "PNG Info" tab) or in Notepad or other apps. This data will give you a first idea of what kinds of models/checkpoints, samplers, LoRAs and other settings people tend to use.
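If you'd rather read that metadata with a script, here is a minimal sketch using only the Python standard library. It assumes the generation data sits in a plain tEXt chunk under the key "parameters", which is how Automatic1111 saves it as far as I know; other tools may use different keys, and text stored in other chunk types (such as iTXt) is not handled here.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict[str, str]:
    """Return the key/value pairs of all tEXt chunks in raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # Payload layout: keyword, a null byte, then the text.
            key, _, value = payload.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 12 + length  # skip header (8), payload and CRC (4)
    return texts
```

To use it, read a downloaded image with `open("image.png", "rb").read()` (the file name is just a placeholder), pass the bytes to `read_png_text_chunks` and look up `"parameters"` in the result with `.get("parameters")`.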

2) Select a model

Start with whichever model you see used for the images you like the most.

3) Select the best sampler for that model

DPM++ SDE Karras works well with every model that I have tried. Generation data may give you an indication of which sampler to use.

If you want to be thorough (and it's a good idea to do this eventually), you can use the script "X/Y/Z plot" to create a grid with models and samplers. This gives you a quick visual overview that helps you rule out many samplers (the ones that create noisy or distorted images) and shows which ones are worth trying out further.

(Note that you can use the same script to test CFG, sampling steps and other settings.)
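To make the idea of such a grid concrete, here is a small sketch of what the script does conceptually: it runs every combination of the values you list. This is not the script's actual code; the function name `build_grid` and the dictionary keys are made up for illustration, and the values are only examples. For a fair comparison, keep the prompt and seed the same for every cell.

```python
from itertools import product


def build_grid(samplers, cfg_scales, steps):
    """Return one settings dict per cell of a sampler x CFG x steps grid."""
    return [
        {"sampler": s, "cfg_scale": c, "steps": n}
        for s, c, n in product(samplers, cfg_scales, steps)
    ]


# Example values only; swap in whatever you want to compare.
grid = build_grid(
    samplers=["DPM++ SDE Karras", "Euler a"],
    cfg_scales=[5, 7, 9],
    steps=[20],
)
for cell in grid:
    print(cell)
```

Two samplers, three CFG values and one step count give six cells. Each extra axis value multiplies the number of images, so it pays to keep the lists short.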

4) Configure performance enhancers

If you're new to Stable Diffusion, you will want to just try out different prompts and perhaps load up some LoRAs or embeddings somewhat at random.

Once you are eager to improve your results more consistently, try adding LoRAs that promise to do just that, whether it be in terms of overall style, lighting, sharpness, composition or other general features, and test each one at different weights. Start with strong weights to get a general idea of what a LoRA does (you can also experiment with negative weights), then make more incremental changes, always trying to tweak the result in the direction you want to go.
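The coarse-then-fine approach to weights can be sketched like this. The `<lora:name:weight>` tag is the syntax Automatic1111 uses in prompts; the LoRA name `detail_lora`, the base prompt and the specific weight values are assumptions chosen only for illustration.

```python
def lora_tag(name: str, weight: float) -> str:
    """Format a LoRA reference in Automatic1111 prompt syntax."""
    return f"<lora:{name}:{weight:g}>"


def fine_weights(best: float, step: float = 0.1, count: int = 2) -> list[float]:
    """Weights in small increments around the best coarse value."""
    return [round(best + step * i, 2) for i in range(-count, count + 1)]


# Pass 1: strong weights (including negative ones) to see what the LoRA does.
coarse = [-1.0, -0.5, 0.5, 1.0]

# Pass 2: suppose 0.5 looked best; now test small steps around it.
fine = fine_weights(0.5)

base_prompt = "portrait photo, an old fisherman"  # example prompt
for w in coarse + fine:
    print(f"{base_prompt}, {lora_tag('detail_lora', w)}")
```

Generating each variant with the same seed makes it much easier to see what the weight alone is changing.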

Check out "conceptmods" that only change existing weights without adding new images. With the Popular LoRA you can modify any model to assign a greater weight to "masterpiece" images by default without having to prompt "masterpiece" each time.

Do the same with negative embeddings: Try adding different ones one by one at different weights and use whatever combination reliably gives the best result.
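Here is a rough sketch of testing one negative embedding at a time. It assumes Automatic1111's attention syntax `(term:weight)` for weighting an embedding in the negative prompt; the embedding names are placeholders, not real files.

```python
def weighted(term: str, weight: float) -> str:
    """Apply attention syntax; a weight of 1.0 needs no parentheses."""
    return term if weight == 1.0 else f"({term}:{weight:g})"


def negative_prompt_variants(kept, candidate, weights):
    """Negative prompts that add one candidate to the embeddings already kept."""
    return [", ".join([*kept, weighted(candidate, w)]) for w in weights]


# "neg_embedding_a" has already proven useful; now test a second one.
variants = negative_prompt_variants(
    kept=["neg_embedding_a"],
    candidate="neg_embedding_b",
    weights=[0.6, 1.0, 1.4],
)
for v in variants:
    print(v)
```

If none of the variants beats the negative prompt without the candidate, drop the candidate and move on to the next one.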

5) Develop a good prompt format

Use the positive prompt to give a concise and specific description of the individual image you want to create, including everything that absolutely must be in the image (specific people, objects, backgrounds) but nothing else. Experiment with individual keywords meant to boost performance ("masterpiece", "best quality", etc.), but try to use very few of those (if any) so as not to crowd out these specific instructions.

If you use a consistent prompt format (rather than a wall of text with keywords in a random order), that makes it easier to tweak images. Keyword order greatly affects the result, and you may sometimes want to move an especially important keyword (closer) to the front.

Here is the type of format that you'll see a lot of people use and recommend:

[type of image], [subject], [essential details], [overall mood]

(Note that photorealistic images benefit from somewhat longer prompts that include information like camera name, lighting condition and photographer name. See PromptGeek's fantastic guide on this subject.)
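If you keep your prompts in a script or a notes file, the format above can be sketched as a small helper. The function names and the example keywords are my own inventions for illustration; the point is only that a fixed field order makes it easy to swap out or move a single keyword.

```python
def build_prompt(image_type, subject, details=(), mood=""):
    """Join the fields in a fixed order: type, subject, details, mood."""
    parts = [image_type, subject, *details, mood]
    return ", ".join(p.strip() for p in parts if p and p.strip())


def move_to_front(prompt, keyword):
    """Move one keyword to the start of a comma-separated prompt."""
    parts = [p.strip() for p in prompt.split(",")]
    if keyword not in parts:
        return prompt
    parts.remove(keyword)
    return ", ".join([keyword, *parts])


prompt = build_prompt(
    "oil painting",
    "a lighthouse",
    details=["stormy sea", "seagulls"],
    mood="dramatic",
)
print(prompt)
print(move_to_front(prompt, "stormy sea"))
```

The second call shows the kind of tweak mentioned earlier: the same keywords, with the one that matters most moved to the front.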

First try using a plain-text prompt (a good SD setup might already have a good enough concept of what you describe). If that doesn't give you the right result, use embeddings for specifics whenever possible. Only then use LoRAs for specifics. Conversely, don't try to do with prompts what a LoRA could do better (such as the kind of performance enhancements described in step 4).

Why bother with all of this?

The method described above takes a lot of iterations to do thoroughly, but with every bit that you improve your setup for all images (of one general style), you make it that much easier to achieve a good result for each image. Time spent on tweaking your overall setup can save you a lot of time later trying to tweak individual images. And when you do find yourself tweaking images, you'll already have a good understanding (from experience) of what component may create what effect at which weight, and you'll be more likely to know which changes to make.
