What does each of the model types mean?
Checkpoint: Shadow
Textual Inversion: negative prompt
Hypernetwork
Aesthetic Gradient
LoRA: Learned Data Characters and People
LyCORIS
Controlnet
Upscaler: high quality
VAE: color adjustment
Poses: Adjust the posture of the created character
Wildcard
Other
I only know this much. Can you tell me what each of these means?
2 Answers
Of course, you can find all of this information on the internet, but here's a quick rundown.
Checkpoint - the base model. It comes in two file formats, .ckpt or .safetensors; the latter can't contain any malicious code, so if a model offers both, it's better to download the .safetensors version. I've heard that this format doesn't work with some generative services, but I'm not sure; I've only used a Stable Diffusion WebUI instance.
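To make the malicious-code point concrete, here is a minimal Python sketch of the difference between the two formats (not from the original answer; the file names are placeholders, and it assumes you have `torch` and the `safetensors` package installed):

```python
import torch
from safetensors.torch import load_file

# A .ckpt is a Python pickle: unpickling can execute arbitrary code
# embedded in the file, which is exactly the risk being described.
ckpt_weights = torch.load("model.ckpt", map_location="cpu", weights_only=False)

# A .safetensors file is a plain tensor container: loading it only
# reads raw tensor data and metadata, so it can't run hidden code.
safe_weights = load_file("model.safetensors")
```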
Textual Inversion (embeddings) - usually used as negative embeddings, like EasyNegative or Bad-Hands, which are supposed to help with low-quality images or anatomy. In practice, Bad-Hands can perform at the same level as the model without it; for me, it's useless.
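For what it's worth, this is roughly how a negative embedding is wired up outside the WebUI, using the `diffusers` library (a sketch, not part of the original answer; the model ID and file path are assumptions):

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Register the embedding file under a trigger token.
pipe.load_textual_inversion("EasyNegative.safetensors", token="EasyNegative")

# For negative embeddings, the trigger token goes in the negative prompt.
image = pipe("a portrait photo", negative_prompt="EasyNegative").images[0]
```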
Hypernetwork - a "small" fine-tuned model for Stable Diffusion that helps with whatever it was trained on: characters, styles, etc. These days most people use LoRAs, so I think it's better to avoid hypernetworks; LoRAs provide much higher quality and accuracy.
Aesthetic Gradient - I don't know. Never used it, and wouldn't.
LoRA - low-rank adaptation; in simple terms, the same idea as a hypernetwork. It can help you with different concepts like styles, characters, clothes, etc. You can imagine that Stable Diffusion itself is a game, and LoRAs are cool mods for this game.
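To stretch the "mods for a game" analogy, this is roughly what loading a mod looks like with the `diffusers` library (a sketch; the LoRA file name and the 0.8 strength are assumptions, and the scaling knob varies between diffusers versions):

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Load the LoRA weights on top of the base checkpoint.
pipe.load_lora_weights(".", weight_name="my_character_lora.safetensors")

# The scale plays the same role as the :0.8 in the WebUI's
# <lora:my_character_lora:0.8> prompt syntax.
image = pipe(
    "a character standing in a forest",
    cross_attention_kwargs={"scale": 0.8},
).images[0]
```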
LyCORIS - the same idea as LoRA, but it offers more training settings and, I believe, performs better for styles.
Controlnet - it's better to watch videos on this topic, really.
Upscaler - a model used to upscale your images during the generation process (hires. fix) or in the inpaint tab.
VAE - a color adjustment (correction) model. Most models don't come with a baked-in VAE, so you can use any VAE you want without issues. VAEs should be put into the SD folder -> models -> VAE, and then chosen in the Settings tab -> Stable Diffusion (in the left list) -> SD VAE on your WebUI page.
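If you script with `diffusers` instead of the WebUI, swapping the VAE is a single argument. This sketch uses `stabilityai/sd-vae-ft-mse`, the common public color-correction VAE, as an assumed example:

```python
from diffusers import AutoencoderKL, StableDiffusionPipeline

# Load a standalone VAE and hand it to the pipeline, the scripted
# equivalent of choosing it under Settings -> SD VAE in the WebUI.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", vae=vae
)
```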
Poses - used for ControlNet.
Wildcard - a file containing a list of random tags that can be swapped into your generation. For example, you have a "dress" file that contains different dresses, each described in several tags; once you put that filename in your prompt, a random entry from the wildcard file is substituted in. It's better to check guides as well if you want to use it.
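Under the hood, a wildcard extension does little more than this Python sketch: replace a placeholder with a random line from the named file. The `__name__` syntax follows the common dynamic-prompts convention; the exact syntax varies by extension:

```python
import random
import re
from pathlib import Path

def expand_wildcards(prompt: str, wildcard_dir: str = "wildcards") -> str:
    """Replace each __name__ placeholder with a random line
    from wildcards/name.txt."""
    def pick(match: re.Match) -> str:
        lines = Path(wildcard_dir, match.group(1) + ".txt").read_text().splitlines()
        return random.choice([line for line in lines if line.strip()])
    return re.sub(r"__(\w+)__", pick, prompt)

# With a wildcards/dress.txt listing one dress description per line:
# expand_wildcards("a woman wearing __dress__")
#   -> "a woman wearing a red silk evening gown"  (random each call)
```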
Other - for other files, obviously.
Great questions; I've added them to the new users guide.
Checkpoint: The largest file and determines what you could possibly generate. Some models generate more anime and some models generate more photorealistic images, for example.
Textual Inversion: A textual inversion is a file placed in `stable-diffusion-webui\embeddings` which changes the meaning of a token to have different weights. Simply put, a "token" is a word or group of words, and the "weights" are what shapes are generated by that token.
negative prompt: A secondary prompt window that tells the model what to avoid. Often, people put ugly things, or aspects of the image they don't want. If you keep randomly generating characters in hats even though it's not in the prompt, put `hat` in the negative prompt.
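The hat example, expressed as a `diffusers` call (an assumed interface, shown for illustration; in the WebUI the negative prompt is simply the second text box):

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# The negative prompt is a second conditioning text the sampler is
# steered away from at each denoising step.
image = pipe(
    prompt="portrait of a woman in a park",
    negative_prompt="hat, blurry, extra fingers",
).images[0]
```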
Hypernetwork: Without getting too deep into the technical details, this is similar to an early version of LoRA. They aren't very popular; most people find LoRA and LyCORIS easier to use.
Aesthetic Gradient
LoRA: Learned Data Characters and People. It's worth noting that what checkpoint the LoRA is trained against can have a significant impact on how compatible it will be across different checkpoints. Many users recommend training against SD1.5 (at the time of this writing, SDXL was not widely available) to ensure compatibility with the widest range of checkpoints.
LyCORIS: Similar to LoRA; it may require installing an additional software extension, depending on your interface. Its training parameters differ from LoRA's and can work better for some stylistic choices.
Controlnet: See my guides on using canny ControlNets, the most famous ControlNet used with img2img. Another popular ControlNet is OpenPose, which lets you manipulate a skeleton to pose your generated image.
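As a taste of what those guides cover, here is a minimal canny ControlNet run in `diffusers` (a sketch, not from the original answer; the model IDs are the standard public ones, while the reference image path and Canny thresholds are assumptions):

```python
import cv2
import numpy as np
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet
)

# Extract edges from a reference photo; the generation follows them.
gray = np.array(Image.open("reference.png").convert("L"))
edges = cv2.Canny(gray, 100, 200)
control = Image.fromarray(np.stack([edges] * 3, axis=-1))

image = pipe("a watercolor painting of the same scene", image=control).images[0]
```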
Upscaler: high quality. I recommend the 4x foolhardy upscaler (place it in `stable-diffusion-webui\models\ESRGAN`), and the following settings are a good starting point:
Sampler: DPM++ 2M SDE Karras
Steps: 40
Dimensions: 512x768
Restore faces, highres fix (turn off restore faces for anime)
High res steps: 25
Denoising: 0.4 (increase this for more detail, or decrease it if there are too many errors)
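Those same settings, expressed as a call to the WebUI's txt2img API for anyone who scripts it (this assumes you launched the WebUI with `--api` on the default port; the upscaler string must match the name shown in your UI, and `4x_foolhardy_Remacri` is an assumed spelling):

```python
import requests

payload = {
    "prompt": "a portrait photo",
    "sampler_name": "DPM++ 2M SDE Karras",
    "steps": 40,
    "width": 512,
    "height": 768,
    "restore_faces": True,            # turn off for anime
    "enable_hr": True,                # highres fix
    "hr_upscaler": "4x_foolhardy_Remacri",
    "hr_second_pass_steps": 25,       # high res steps
    "denoising_strength": 0.4,        # raise for detail, lower for fewer errors
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
images = r.json()["images"]           # base64-encoded results
```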
VAE: color adjustment. Usually `vae-ft-mse-840000-ema-pruned`, unless the model recommends otherwise.
Poses: Adjust the posture of the created character with a controlnet. The guide is here: https://civitai.com/articles/157/openpose-controlnets-v11-using-poses-and-generating-new-ones
Wildcard: First select `Prompt matrix` in the `Script` dropdown. Separate multiple tokens using the `|` character, and the system will produce an image for every combination of them. For example, if you use `photograph a woman with long hair|curly hair|dyed hair` as the prompt, four images are generated:
photograph a woman with long hair
photograph a woman with long hair, curly hair
photograph a woman with long hair, dyed hair
photograph a woman with long hair, curly hair, dyed hair
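The combination logic is easy to verify in a few lines of Python: the first segment is always kept, and every subset of the remaining `|`-separated segments is appended, giving 2^n prompts for n optional segments (a sketch of the behavior, not the WebUI's actual code):

```python
from itertools import combinations

def prompt_matrix(prompt: str) -> list[str]:
    base, *options = prompt.split("|")
    prompts = []
    for r in range(len(options) + 1):
        for combo in combinations(options, r):
            prompts.append(", ".join([base, *combo]))
    return prompts

for p in prompt_matrix("photograph a woman with long hair|curly hair|dyed hair"):
    print(p)
# photograph a woman with long hair
# photograph a woman with long hair, curly hair
# photograph a woman with long hair, dyed hair
# photograph a woman with long hair, curly hair, dyed hair
```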