The translation was done with Gemini Experimental 1206.
All links in the article have been replaced with placeholders; the original links are available in the original article.
1. Introduction
This document aims to provide a comprehensive and up-to-date introduction to the NoobAI-XL model.
Please note that due to the dynamic nature of information and the difficulty of maintenance, this document may contain errors or omissions.
1.1 Basic Introduction
NoobAI-XL is a text-to-image diffusion model developed by Laxhar Dream Lab and sponsored by BlueRun. The model license inherits from fair-ai-public-license-1.0-sd and includes the additional restrictions of the NoobAI-XL model license. The model is based on the SDXL architecture and uses Illustrious-xl-early-release-v0 as its base model. It has been trained for a large number of epochs on the complete Danbooru and e621 datasets (approximately 13,000,000 images in total), resulting in extensive knowledge and excellent performance.
1.2 Overview
NoobAI-XL possesses a vast amount of knowledge: it can reproduce tens of thousands of anime characters and artist styles, recognizes a large number of special concepts from the anime domain, and has extensive knowledge of furry content.
NoobAI-XL offers two versions: noise prediction and V-prediction. In short, the noise prediction version generates more diverse and creative images, while the V-prediction version adheres more closely to the prompts, producing images with a wider color gamut and stronger light and shadow effects.
NoobAI-XL has a growing ecosystem of community support, including various LoRAs, ControlNets, IP-Adapters, and more.
NoobAI-XL includes a series of models, mainly noise prediction and V-prediction, which will be described in detail later.
2. Quick Start
Before reading this section, readers should already be familiar with the basic usage of WebUI, ComfyUI, forge, reForge, or a similar UI. Otherwise, please first learn the basics here or from online tutorials (e.g., on Bilibili).
2.1 Model Download
Model Download Sites
CivitAI: Click here (Note: May require VPN)
LiblibAI: Click here
Huggingface: Click here (Note: May require VPN)
If you are unsure which model to download, you can browse here.
2.2 Model Loading
NoobAI-XL models are divided into two categories: noise prediction (epsilon prediction, or eps-pred for short) models and V-prediction (v-pred for short) models. Models with "eps", "epsilon-pred", or "eps-pred" in their names are noise prediction models; they behave much like other SDXL models, so if you use them you can skip this section. Models with "v" or "v-pred" in their names are V-prediction models, which differ from most conventional models. Please read the installation guide in this section carefully! For an introduction to the principles of V-prediction models, please refer to this article.
2.2.1 Loading V-prediction Models
V-prediction is a relatively rare model training technique, and models trained using this technique are called V-prediction models. Compared to noise prediction, V-prediction models are known for their higher prompt adherence, wider color gamut, and stronger light and shadow effects. Examples include NovelAI Diffusion V3 and COSXL. Due to their late emergence and the scarcity of such models, some mainstream image generation projects and UIs do not directly support them. Therefore, if you intend to use V-prediction models, some additional operations are required. This section will introduce their specific usage. If you encounter any difficulties during use, you can also directly contact any of the model authors for assistance.
a. Using in forge or reForge
forge and reForge are two image generation UIs developed by lllyasviel and Panchovix, respectively; both are extended versions of WebUI. Their main branches support V-prediction models, and they operate almost identically to WebUI, so they are recommended. If you have already installed one of them, simply run git pull in its installation directory to update, then restart. If not, you can refer to online tutorials for installation and use.
b. Using in ComfyUI
ComfyUI is an image generation UI developed by comfyanonymous that lets users freely wire up nodes, and is known for its flexibility and professionalism. Using V-prediction models in it only requires adding extra nodes (in current ComfyUI builds, typically a ModelSamplingDiscrete node with sampling set to v_prediction and, for zero-terminal-SNR models, zsnr enabled).
c. Using in WebUI
WebUI refers to the stable-diffusion-webui project developed by AUTOMATIC1111. Currently, the main branch of WebUI does not support V-prediction models; you need to switch to the dev branch. Please note that this method is unstable and may have bugs. Improper use may even cause irreversible damage to WebUI, so please back up your WebUI in advance. The specific method is as follows:
If you have not installed WebUI, please refer to online tutorials to install it;
Open the console or terminal in your stable-diffusion-webui installation directory;
Enter the command git checkout dev and press Enter;
Restart WebUI.
d. Using in Diffusers
Diffusers is a Python library dedicated to diffusion models. Using it requires some coding experience, so it is recommended for developers and researchers. Code example:
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler

# Load the checkpoint from a local .safetensors file.
ckpt_path = "/path/to/model.safetensors"
pipe = StableDiffusionXLPipeline.from_single_file(
    ckpt_path,
    use_safetensors=True,
    torch_dtype=torch.float16,
)

# Switch the scheduler to V-prediction with zero-terminal-SNR rescaling,
# as required by the V-prediction versions of the model.
scheduler_args = {"prediction_type": "v_prediction", "rescale_betas_zero_snr": True}
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, **scheduler_args)

pipe.enable_xformers_memory_efficient_attention()  # optional; requires xformers
pipe = pipe.to("cuda")

# Diffusers does not apply WebUI-style prompt weighting, so parentheses
# need no backslash escaping here (unlike in WebUI/forge prompts).
prompt = "masterpiece, best quality, john kafka, nixeu, quasarcake, chromatic aberration, film grain, horror (theme), limited palette, x-shaped pupils, high contrast, color contrast, cold colors, arlecchino (genshin impact), black theme, gritty, graphite (medium)"
negative_prompt = "nsfw, worst quality, old, early, low quality, lowres, signature, username, logo, bad hands, mutated hands, mammal, anthro, furry, ambiguous form, feral, semi-anthro"

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    num_inference_steps=28,
    guidance_scale=5,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("output.png")
2.3 Model Usage
2.3.1 Prompts
NoobAI-XL has no strict requirements for prompts, and the recommended operations in this article are just icing on the cake.
We recommend using tags as prompts to add the desired content. Each tag is an English word or phrase, and tags are separated by commas ", ". Tags taken directly from Danbooru and e621 have stronger effects. For even better results, you can refer to the prompt specifications below.
We recommend always adding the aesthetic tag "very awa" and the quality tag "masterpiece" to the prompt.
NoobAI-XL supports generating highly accurate characters and artist styles, both triggered by tags, which we call "trigger words". For characters, the trigger word is their character name; for artist styles, the trigger word is the artist's name. The complete trigger word table can be downloaded from noob-wiki. A detailed introduction to trigger words can be found below.
Similar to NovelAI, NoobAI-XL supports special tags such as quality, aesthetics, creation year, creation period, and safety rating, which are used as auxiliary tools. Interested readers can find them in the detailed introduction below.
2.3.2 Generation Parameters
a. Basic Parameters
The table below recommends three generation parameters: Sampler, Sampling Steps, and CFG Scale. In the original article, bold indicates strongly recommended values and bold red indicates mandatory values; using other values may produce unexpected results.
Recommended Generation Parameters
All noise prediction versions:
Sampler: Euler A, CFG: 5~7, Sampling Steps: 28~35
V-prediction 0.9r version:
Sampler: Euler, CFG: 3.5~5.5, Sampling Steps: 32~40
Alternatively: Sampler: Euler A, CFG: 3~4, Sampling Steps: 38~40
V-prediction 0.75s version:
Sampler: Euler A, CFG: 3~4, Sampling Steps: 38~40
V-prediction 0.65s version:
Sampler: Euler A, CFG: 3~5, Sampling Steps: 28~40
V-prediction 0.6 version:
Sampler: Euler A or Euler, CFG: 3.5~5.5, Sampling Steps: 32~40
Alternatively: Sampler: Euler A, CFG: 5~7, Sampling Steps: 28~35
V-prediction 0.5 version:
Sampler: Euler, CFG: 3.5~5.5, Sampling Steps: 28~35
V-prediction test version:
Sampler: Euler A, CFG: 5~7, Sampling Steps: 28~35
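For reference, in Diffusers the "Euler A" sampler corresponds to EulerAncestralDiscreteScheduler. A minimal sketch, assuming the pipe and prompt objects from the Diffusers example in section 2.2.1 d (for an eps-pred checkpoint, the V-prediction scheduler arguments are simply omitted):

from diffusers import EulerAncestralDiscreteScheduler

# "Euler A" in WebUI terminology; keeps the rest of the scheduler config unchanged.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
image = pipe(prompt=prompt, num_inference_steps=28, guidance_scale=5).images[0]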
b. V-prediction Model Precautions
For V-prediction models, the following settings are recommended to (i) improve color, lighting, and detail; (ii) eliminate oversaturation and overexposure; and (iii) enhance semantic understanding:
Enable the Rescale CFG parameter (around 0.7) where available. Some image generation UIs do not support it.
Alternatively, use the Euler Ancestral CFG++ sampler with a CFG Scale between 1 and 1.8. Some image generation UIs do not support it either.
Due to compatibility issues, some samplers, such as the DPM series, may cause V-prediction models to generate oversaturated images or fragmented lines.
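In Diffusers, Rescale CFG corresponds to the guidance_rescale argument of the pipeline call. A minimal sketch, reusing the V-prediction pipe, prompt, and negative_prompt from the example in section 2.2.1 d:

# guidance_rescale implements CFG rescaling (the ~0.7 value recommended above).
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=5,
    guidance_rescale=0.7,
).images[0]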
c. Resolution
The resolution (width x height) of the generated image is an important parameter. Due to architectural reasons, all SDXL models, including NoobAI-XL, need specific resolutions to achieve the best results; deviating from them, even slightly, weakens the quality of the generated image. The recommended resolutions for NoobAI-XL are as follows:
Recommended Resolutions
Resolution (width x height): 768x1344, Ratio: 9:16
Resolution (width x height): 832x1216, Ratio: 2:3
Resolution (width x height): 896x1152, Ratio: 3:4
Resolution (width x height): 1024x1024, Ratio: 1:1
Resolution (width x height): 1152x896, Ratio: 4:3
Resolution (width x height): 1216x832, Ratio: 3:2
Resolution (width x height): 1344x768, Ratio: 16:9
You can also use larger resolutions, although this is less stable. (According to research on SD3, when the generated area is increased by a factor of k, the model's uncertainty increases by a factor of k^2.) We recommend that the area of the generated image not exceed 1.5 times the recommended area; for example, 1024x1536.
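The bucket choice can also be automated. A minimal sketch with a hypothetical helper that snaps a desired size to the nearest recommended resolution:

# Hypothetical helper: snap a desired size to the nearest recommended bucket.
BUCKETS = [(768, 1344), (832, 1216), (896, 1152), (1024, 1024),
           (1152, 896), (1216, 832), (1344, 768)]

def nearest_bucket(width: int, height: int) -> tuple[int, int]:
    target = width / height
    return min(BUCKETS, key=lambda wh: abs(wh[0] / wh[1] - target))

print(nearest_bucket(1080, 1920))  # -> (768, 1344)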
2.3.3 Other Precautions
V-prediction models are more sensitive to prompts and generation parameters;
CLIP skip does not apply to any SDXL-architecture model, so there is no need to set it;
The model does not require any external VAE model;
2.4 Other Resources
Created by 年糕特工队, the NOOBAI XL Quick Guide provides a beginner's tutorial for NoobAI-XL and is recommended for beginners; an English version is also available.
Produced by 风吟, "A video to teach you how to use V-prediction models, NoobAI tutorial for beginners" provides a video tutorial on deploying V-prediction models.
3.1 Model Overview
3.1.1 Base Models
NoobAI-XL includes a series of base models with different versions. The table below summarizes the characteristics of each version.
Base Model Versions
Version Number: Early-Access
Prediction Type: Noise Prediction
Download Address: CivitAI, Huggingface
Iterated from: Illustrious-xl-early-release-v0
Version Characteristics: The earliest version, but already has sufficient training.
Version Number: Epsilon-pred 0.5
Prediction Type: Noise Prediction
Download Address: CivitAI, Huggingface
Iterated from: Early-Access
Version Characteristics: (Recommended) The most stable version, the only drawback is insufficient knowledge of niche concepts.
Version Number: Epsilon-pred 0.6
Prediction Type: Noise Prediction
Download Address: Huggingface
Iterated from: Epsilon-pred 0.5
Version Characteristics: (Recommended) The last version in which only the UNet was trained, with excellent convergence. The test group called it "178000", and many liked it.
Version Number: Epsilon-pred 0.75
Prediction Type: Noise Prediction
Download Address: CivitAI, Huggingface
Iterated from: Epsilon-pred 0.6
Version Characteristics: Trained the text encoder (tte) to learn more niche knowledge, at the cost of some quality degradation.
Version Number: Epsilon-pred 0.77
Prediction Type: Noise Prediction
Download Address: Huggingface
Iterated from: Epsilon-pred 0.75
Version Characteristics: Trained for two more epochs on top of Epsilon-pred 0.75 to recover the degraded performance.
Version Number: Epsilon-pred 1.0
Prediction Type: Noise Prediction
Download Address: CivitAI, Huggingface
Iterated from: Epsilon-pred 0.77
Version Characteristics: (Recommended) Trained for 10 additional epochs to consolidate the text encoder's new knowledge; balanced performance.
Version Number: V-pred test
Prediction Type: V-prediction
Download Address: CivitAI, Huggingface
Iterated from: Epsilon-pred 0.5
Version Characteristics: (Not recommended) The initial experimental version of V-prediction.
Version Number: V-pred 0.5
Prediction Type: V-prediction
Download Address: CivitAI, Huggingface
Iterated from: Epsilon-pred 1.0
Version Characteristics: Has the problem of oversaturation.
Version Number: V-pred 0.6
Prediction Type: V-prediction
Download Address: CivitAI, Huggingface
Iterated from: V-pred 0.5
Version Characteristics: The saturation problem is somewhat alleviated. Based on preliminary evaluation results, V-pred 0.6 performs exceptionally well in terms of rare knowledge coverage, reaching the highest level among currently released models. At the same time, this model significantly improves the quality degradation problem.
Version Number: V-pred 0.65
Prediction Type: V-prediction
Download Address: Huggingface
Iterated from: V-pred 0.6
Version Characteristics: Has the problem of oversaturation.
Version Number: V-pred 0.65s
Prediction Type: V-prediction
Download Address: CivitAI, Huggingface
Iterated from: V-pred 0.6
Version Characteristics: The saturation problem is almost solved!
Version Number: Epsilon-pred 1.1
Prediction Type: Noise Prediction
Download Address: CivitAI, Huggingface
Iterated from: Epsilon-pred 1.0
Version Characteristics: (Recommended) Solved the average brightness problem, with improvements in all aspects.
Version Number: V-pred 0.75
Prediction Type: V-prediction
Download Address: Huggingface
Iterated from: V-pred 0.65
Version Characteristics: Has the problem of oversaturation.
Version Number: V-pred 0.75s
Prediction Type: V-prediction
Download Address: CivitAI, Huggingface
Iterated from: V-pred 0.65
Version Characteristics: (Recommended) Solves the saturation, noise, and graininess problems that appeared in extreme cases.
3.1.2 Extended Models: ControlNet
ControlNet Models
Prediction Type: Noise Prediction
ControlNet Type: hed soft edge
Link: CivitAI, Huggingface
Preprocessor Type: softedge_hed
Prediction Type: Noise Prediction
ControlNet Type: anime lineart
Link: CivitAI, Huggingface
Preprocessor Type: lineart_anime
Prediction Type: Noise Prediction
ControlNet Type: midas normal map
Link: CivitAI, Huggingface
Preprocessor Type: normal_midas
Prediction Type: Noise Prediction
ControlNet Type: midas depth map
Link: CivitAI, Huggingface
Preprocessor Type: depth_midas
Prediction Type: Noise Prediction
ControlNet Type: canny contour
Link: CivitAI, Huggingface
Preprocessor Type: canny
Prediction Type: Noise Prediction
ControlNet Type: openpose human pose
Link: CivitAI, Huggingface
Preprocessor Type: openpose
Prediction Type: Noise Prediction
ControlNet Type: manga line
Link: CivitAI, Huggingface
Preprocessor Type: manga_line / lineart_anime / lineart_realistic
Prediction Type: Noise Prediction
ControlNet Type: realistic lineart
Link: CivitAI, Huggingface
Preprocessor Type: lineart_realistic
Prediction Type: Noise Prediction
ControlNet Type: midas depth map
Link: CivitAI, Huggingface
Preprocessor Type: depth_midas
Note: New version
Prediction Type: Noise Prediction
ControlNet Type: hed scribble
Link: CivitAI, Huggingface
Preprocessor Type: scribble_hed
Prediction Type: Noise Prediction
ControlNet Type: pidinet scribble
Link: CivitAI, Huggingface
Preprocessor Type: scribble_pidinet
Note that the preprocessor you use must match the preprocessor type required by the ControlNet model. However, the prediction type of the ControlNet does not need to match the prediction type of the base model.
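For Diffusers users, a minimal ControlNet sketch, assuming a recent Diffusers version and a ControlNet saved in Diffusers format; the file paths are placeholders, and the control image must already come from the matching preprocessor (here: canny):

import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "/path/to/noob-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_single_file(
    "/path/to/model.safetensors", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

control_image = Image.open("canny_edges.png")  # output of the canny preprocessor
image = pipe(
    prompt="masterpiece, best quality, very awa, 1girl",
    image=control_image,
    controlnet_conditioning_scale=0.8,
).images[0]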
3.1.3 Extended Models: IP-Adapter
IP-Adapter (IPA) has been released on Huggingface and CivitAI.
3.1.4 LoRA Models
Most LoRAs trained on the NoobAI-XL noise prediction version can be used on both noise prediction and V-prediction versions, and vice versa.
3.2 Prompt Guide
First, we need to clarify that the role of prompts is to guide. Good prompts unleash the model's potential, but bad or even wrong prompts do not necessarily make results worse. Different models have different optimal prompt usages, the effects of misuse are often subtle, and in a few cases results may even improve. This guide records the theoretically optimal way to write prompts for the model; capable readers can also exercise their own creativity.
This section will provide a detailed prompt writing guide, including prompt writing specifications, specific usage of character and style trigger words, usage of special tags, and more. Readers interested in prompt engineering can choose to read selectively.
3.2.1 Prompt Specifications
NoobAI-XL has the same prompt specifications as other anime models. This section will systematically introduce the basic writing specifications of prompts and help readers avoid common prompt writing misconceptions in the community.
By format, prompts can be roughly divided into two categories: tags and natural language. The former is mostly used for anime models, the latter for realistic models. Regardless of type, unless the model specifically states otherwise, a prompt should contain only English letters, numbers, and English punctuation. Note that the Chinese comma "，" cannot be used in place of the English comma ","; they are not equivalent.
Tag prompts consist of lowercase English words or phrases separated by commas ", ", for example, "1girl, solo, blue hair" contains three tags, "1girl", "solo", and "blue hair".
Extra spaces, line breaks, etc. in the prompt will not affect the actual generation effect. In other words, "1girl, solo, blue hair" and "1girl,solo,blue hair" have exactly the same effect.
Prompts should not contain any underscores "_". Influenced by websites such as Danbooru, the habit of using underscores "_" instead of spaces " " between words in tags has spread; this is actually a misuse and will cause generated results to differ from those using spaces. Most models, including NoobAI-XL, recommend against any underscores in prompts. Such misuse can range from degrading generation quality to making trigger words partially or even completely ineffective.
Escape parentheses when necessary. Parentheses, including round brackets, square brackets, and curly brackets, are very special symbols in prompts. Unlike ordinary symbols, most image generation software and UIs interpret parentheses as weighting the enclosed content, and parentheses used for weighting lose their original meaning. But what if the prompt itself needs to contain parentheses, as some trigger words do? The answer is to add a backslash "\" before each parenthesis to disable its weighting function. This operation of changing a character's original meaning is called escaping, and the backslash is also called an escape character. For example, without escaping, the prompt "1girl, ganyu (genshin impact)" will be incorrectly interpreted as "1girl, ganyu genshin impact", where "genshin impact" is weighted and the parentheses disappear. With escape characters, the prompt becomes "1girl, ganyu \(genshin impact\)", which works as expected.
In short, tag normalization is divided into two steps: (i) replace the underscores in the tag with spaces, and (ii) add a backslash "\" before each parenthesis.
Tags taken directly from Danbooru and e621 perform better. Therefore, instead of inventing tags yourself, we recommend searching for tags on these two websites. Note that tags obtained this way use underscores "_" between words and have unescaped parentheses, so before adding them to the prompt you need to replace the underscores with spaces and escape the parentheses. For example, convert the Danbooru tag "ganyu_(genshin_impact)" to "ganyu \(genshin impact\)" before using it.
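A minimal sketch of this two-step normalization in Python (the helper name is illustrative):

def normalize_tag(tag: str) -> str:
    # Normalize a raw Danbooru/e621 tag for use in a prompt.
    tag = tag.replace("_", " ")  # step (i): underscores -> spaces
    return tag.replace("(", r"\(").replace(")", r"\)")  # step (ii): escape parentheses

print(normalize_tag("ganyu_(genshin_impact)"))  # -> ganyu \(genshin impact\)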
Do not use invalid meta tags. Meta tags are a special category of tags on Danbooru, used to indicate the characteristics of the image file or the work itself. For example, "highres" indicates that the image has a high resolution, and "oil_painting_(medium)" indicates that the image is in the style of an oil painting. However, not all meta tags are related to the content or form of the image. For example, "commentary_request" indicates that the Danbooru post has a translation request for the work, which has no direct relationship with the work itself, so it has no effect.
Ordered prompts are better. NoobAI-XL recommends writing prompts in a logical order, from primary to secondary. A possible writing order is as follows, please use it as a reference only:
<1girl/1boy/1other/female/male/...>, <character>, <series>, <artist(s)>, <general tags>, <other tags>, <quality tags>
Among them, <quality tags> can be placed at the beginning.
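For example, a prompt following this order, built only from tags that appear elsewhere in this article, might read:
1girl, ganyu \(genshin impact\), genshin impact, wlop, solo, blue hair, masterpiece, best quality, very awa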
Natural language prompts are composed of sentences, each sentence starting with a capital letter and ending with a period ".". Most anime models, including NoobAI-XL, have a better understanding of tags, so natural language is often used as an auxiliary rather than a primary component in prompts.
3.2.2 Character and Artist Styles
NoobAI-XL supports the direct generation of a large number of fan-made anime characters and artist styles. Both characters and styles are triggered by their names, and such names are also tags, called trigger words. You can directly search on Danbooru or e621, and use the obtained tags after normalization as prompts.
3.2.2.1 Usage
There are some differences in how characters and artists are triggered.
For artist styles, just add the artist's name to the prompt without any prefixes, suffixes, or additional modifiers, neither "by xxx" nor "artist:xxx", just "xxx".
For characters, use the format "character name + series". That is, in addition to the character name, add a series tag immediately after the character trigger word to indicate which work the character is from. If a character has multiple series tags, adding one or more of them is acceptable. Note that even if the character name already contains the series name, you still need to add the series tag; do not worry about repetition. Usually "character name + series name" is enough to reproduce the character. For example, the trigger word for the character "ganyu_(genshin_impact)" from the series "genshin_impact" is "ganyu \(genshin impact\), genshin impact". As with artists, character trigger words need no prefixes, suffixes, or additional modifiers.
The table below shows some correct and incorrect examples of character and style triggering:
Character and Artist Style Trigger Examples
Type: Character
Prompt: Rei Ayanami
Correct/Incorrect: Incorrect
Reason: 1. The character name should be "ayanami rei". 2. The series tag "neon genesis evangelion" is not added.
Type: Character
Prompt: character:ganyu \(genshin impact\), genshin impact
Correct/Incorrect: Incorrect
Reason: Unnecessarily added the prefix "character:".
Type: Character
Prompt: ganyu_(genshin impact)
Correct/Incorrect: Incorrect
Reason: 1. The tag is not fully normalized: it should not contain underscores. 2. The series tag is not added.
Type: Character
Prompt: ganyu (genshin impact), genshin impact
Correct/Incorrect: Incorrect
Reason: The tag is not fully normalized: the parentheses are not escaped.
Type: Character
Prompt: ganyu (genshin impact\), genshin impact
Correct/Incorrect: Incorrect
Reason: The tag is not fully normalized: the left parenthesis is not escaped.
Type: Character
Prompt: ganyu \(genshin impact\)，genshin impact
Correct/Incorrect: Incorrect
Reason: Used a Chinese comma to separate the two tags.
Type: Character
Prompt: ganyu \(genshin impact\), genshin impact
Correct/Incorrect: Correct
Type: Artist Style
Prompt: by wlop
Correct/Incorrect: Incorrect
Reason: Unnecessarily added the prefix "by ".
Type: Artist Style
Prompt: artist:wlop
Correct/Incorrect: Incorrect
Reason: Unnecessarily added the prefix "artist:".
Type: Artist Style
Prompt: dino
Correct/Incorrect: Incorrect
Reason: The artist's name is wrong. Trigger words follow Danbooru naming rather than aidxl/ArtiWaifu naming, so it should be "dino \(dinoartforame\)".
Type: Artist Style
Prompt: wlop
Correct/Incorrect: Correct
3.2.2.2 Trigger Word Encyclopedia
For convenience, we also provide complete trigger word tables in noob-wiki for your reference:
Trigger Word Table Information
Danbooru Characters: Click here
Danbooru Artist Styles: Click here
e621 Characters: Click here
e621 Artist Styles: Click here
Trigger Word Table Column Explanations
Column Name: character
Meaning: The tag name of the character on the corresponding website.
Column Name: artist
Meaning: The tag name of the artist style on the corresponding website.
Column Name: trigger
Meaning: The normalized trigger word.
Note: Copy and paste it into the prompt as is.
Column Name: count
Meaning: The number of images with this tag. Access requires a VPN.
Note: Serves as a rough indicator of how accurately the concept can be reproduced. For characters, a count greater than 200 usually reproduces well; for styles, greater than 100.
Column Name: url
Meaning: The tag page on the original website.
Note: Requires a VPN.
Column Name: solo_count
Meaning: In the dataset, the number of images with this tag and only one character in the image.
Note: Character tables only. For characters, a solo_count greater than 50 usually reproduces well. solo_count is a more reliable accuracy indicator than count, which has a larger deviation and lower accuracy.
Column Name: core_tags
Meaning: The core feature tags of the character, including appearance, gender, and clothing. Separated by English commas, each tag is normalized.
Note: Danbooru character tables only. When a niche character is not reproduced accurately, adding several of its core feature tags can improve accuracy.
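Once downloaded, the tables can be filtered programmatically. A minimal sketch, assuming the character table is available as a CSV file with the column names above (the actual file name and format on noob-wiki may differ):

import pandas as pd

df = pd.read_csv("danbooru_character.csv")  # hypothetical file name
reliable = df[df["solo_count"] > 50]        # characters likely to reproduce well
print(reliable[["character", "trigger", "solo_count"]].head())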
3.2.3 Special Tags
Special tags are a type of tag with specific meanings and effects, serving an auxiliary role.
3.2.3.1 Quality Tags
Quality tags are popularity tags derived from Danbooru and e621 user preference statistics. In descending order of quality, they are:
masterpiece > best quality > high quality / good quality > normal quality > low quality / bad quality > worst quality
3.2.3.2 Aesthetic Tags
Aesthetic tags are obtained by scoring images with an aesthetic scoring model. So far there are only two: "very awa" and "worst aesthetic". The former represents the top 5% of data as weighted by waifu-scorer-v3 and waifu-scorer-v4-beta, the latter the bottom 5%. It is named "very awa" because its aesthetic standard is similar to that of the ArtiWaifu Diffusion model. In addition, "very as2", an aesthetic tag still in training with an insignificant effect, represents the top 5% of data as scored by aesthetic-shadow-v2-5.
[Figure: Comparison of the effects of quality and aesthetic tags; generated with the v-pred-0.65s version.]
Quality tags reflect the popularity of the image, and aesthetic tags reflect the aesthetics of a specific image scoring model.
The "very awa" tag helps to enhance the artistic aesthetics of the image while eliminating the "AI feeling";
"very as2" is superior in training but not yet fully developed, so the effect is not obvious.
3.2.3.3 Safety Rating Tags
There are four safety rating tags: general, sensitive, nsfw, and explicit.
Users are expected to consciously add "nsfw" to negative prompts to filter inappropriate content.
3.2.3.4 Year and Period Tags
Year tags are used to indicate the year of creation of the work, indirectly affecting the quality, style, character accuracy, etc. The format is "year xxxx", where "xxxx" is the specific year, such as "year 2024".
Period tags are a range of year tags, which also have a great impact on image quality. The correspondence between tags and years is shown in the table below:
Year and Period Tag Correspondence
Year Range: 2021~2024, Period Tag: newest
Year Range: 2018~2020, Period Tag: recent
Year Range: 2014~2017, Period Tag: mid
Year Range: 2011~2013, Period Tag: early
Year Range: 2005~2010, Period Tag: old
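A minimal sketch mapping a creation year to its period tag, per the table above:

def period_tag(year: int) -> str | None:
    # Year ranges taken from the table above.
    for first, last, tag in [(2021, 2024, "newest"), (2018, 2020, "recent"),
                             (2014, 2017, "mid"), (2011, 2013, "early"),
                             (2005, 2010, "old")]:
        if first <= year <= last:
            return tag
    return None

print(period_tag(2019))  # -> recent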
3.2.4 Other Tips
This section provides examples of recommended usage of prompts, for reference only.
3.2.4.1 Quality Prompts
The following recommended starting formula uses special tags, which are the most relevant tags related to image quality:
masterpiece, best quality, very awa
3.2.4.2 Negative Prompts
The table below introduces common negative prompt tags and their sources. Not all negative tags are necessarily bad; used properly, they can have unexpected positive effects. An assembled example follows the table.
Tag (tag): worst aesthetic
Translation: Worst aesthetics
Note: Includes low-quality, watermarked, manga, multiple views, unfinished sketches, and other low-aesthetic concepts
Source: Aesthetic Tags
Tag (tag): worst quality
Translation: Worst quality
Source: Quality Tags
Tag (tag): low quality
Translation: Low quality
Note: Danbooru's low quality
Source: Quality Tags
Tag (tag): bad quality
Translation: Bad quality
Note: e621's low quality
Source: Quality Tags
Tag (tag): lowres
Translation: Low resolution
Source: Danbooru
Tag (tag): scan artifacts
Translation: Scan artifacts
Source: Danbooru
Tag (tag): jpeg artifacts
Translation: JPEG image compression artifacts
Source: Danbooru
Tag (tag): lossy-lossless
Translation: -
Note: Images that have been converted from a lossy image format to a lossless image format, often full of artifacts.
Source: Danbooru
Tag (tag): ai-generated
Translation: AI-generated
Note: AI-generated images, which often carry the typical slick "AI look".
Source: Danbooru
Tag (tag): abstract
Translation: Abstract
Note: Eliminates messy lines
Source: Danbooru
Tag (tag): official art
Translation: Official art
Note: Illustrations produced by the official company/artist of the series or character. The image may have copyright, company, or artist names printed somewhere, as well as a copyright statement.
Source: Danbooru
Tag (tag): old
Translation: Early image
Source: Period Tags
Tag (tag): 4koma
Translation: 4-panel manga
Source: Danbooru
Tag (tag): multiple views
Translation: Multiple views
Source: Danbooru
Tag (tag): reference sheet
Translation: Character design sheet
Source: Danbooru
Tag (tag): dakimakura (medium)
Translation: Body pillow image
Source: Danbooru
Tag (tag): turnaround
Translation: Full-body turnaround
Source: Danbooru
Tag (tag): comic
Translation: Comic
Source: Danbooru
Tag (tag): greyscale
Translation: Greyscale
Note: Black and white image
Source: Danbooru
Tag (tag): monochrome
Translation: Monochrome
Note: Black and white image
Source: Danbooru
Tag (tag): sketch
Translation: Sketch
Source: Danbooru
Tag (tag): unfinished
Translation: Unfinished work
Source: Danbooru
Tag (tag): furry
Translation: Furry
Source: e621
Tag (tag): anthro
Translation: Anthropomorphic furry
Source: e621
Tag (tag): feral
Translation: Feral
Source: e621
Tag (tag): semi-anthro
Translation: Semi-anthropomorphic furry
Note: When added, it seems to make the image color yellowish
Source: e621
Tag (tag): mammal
Translation: Mammal (furry)
Source: e621
Tag (tag): watermark
Translation: Watermark
Source: Danbooru
Tag (tag): logo
Translation: Logo
Source: Danbooru
Tag (tag): signature
Translation: Artist signature
Source: Danbooru
Tag (tag): text
Translation: Text
Source: Danbooru
Tag (tag): artist name
Translation: Artist name
Source: Danbooru
Tag (tag): dated
Translation: Date
Source: Danbooru
Tag (tag): username
Translation: Username
Source: Danbooru
Tag (tag): web address
Translation: Website address
Source: Danbooru
Tag (tag): bad hands
Translation: Bad hands
Source: Danbooru
Tag (tag): bad feet
Translation: Bad feet
Source: Danbooru
Tag (tag): extra digits
Translation: Extra fingers
Source: Danbooru
Tag (tag): fewer digits
Translation: Fewer fingers
Source: Danbooru
Tag (tag): extra arms
Translation: Extra arms
Source: Danbooru
Tag (tag): extra faces
Translation: Extra faces
Source: Danbooru
Tag (tag): multiple heads
Translation: Multiple heads
Source: Danbooru
Tag (tag): missing limb
Translation: Missing limb
Source: Danbooru
Tag (tag): amputee
Translation: Amputee
Source: Danbooru
Tag (tag): severed limb
Translation: Severed limb
Source: Danbooru
Tag (tag): mutated hands
Translation: Mutated hands
Source: -
Tag (tag): distorted anatomy
Translation: Distorted anatomy
Source: -
Tag (tag): nsfw
Translation: Not Safe For Work
Source: Safety Rating Tags
Tag (tag): explicit
Translation: Explicit
Source: Safety Rating Tags
Tag (tag): censored
Translation: Censored
Source: Danbooru
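As one concrete combination, the negative prompt used in the Diffusers example in section 2.2.1 d is assembled from tags in this table:
nsfw, worst quality, old, early, low quality, lowres, signature, username, logo, bad hands, mutated hands, mammal, anthro, furry, ambiguous form, feral, semi-anthro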
3.2.4.3 Tag Misuse
Commonly Misused Tags
Tag (tag): bad id
Translation: Corrupted image ID
Note: Related to image metadata, not image content
Source: Danbooru
Tag (tag): bad link
Translation: Corrupted image link
Note: Related to image metadata, not image content
Source: Danbooru
Tag (tag): duplicate
Translation: Duplicate image on the website
Note: Related to quality to some extent, but not content duplication
Source: Danbooru