
Upd2.3 SD1.5+PonyXL🦄 by "LSDM"(Layer Structural Defragmentation Methodology)

Update 2.3: Dear friends, the purpose of this update was to test “LSDM” for the PonyXL🦄 derivative(Everclear PNY by Zovya), which brought quite interesting results. :) The third update will probably appear in 1-2 months (I'm still busy with studies), where I will focus my research activities on PonyXL🦄 :)

Introduction

In this article, I propose to consider a prompt expression as a multifactorial equation consisting of overlapping sets, subsets and local variables that define the associative matrix of AI. I will try to reveal the logical, causal and categorical connections of "prompts" in order to increase the accuracy of predictive power in relation to the behavior of specific variables and their values.

I hope that after reading this material, you will learn the following:

  1. How to achieve the desired result in fewer iterations.

  2. A method for determining the correlates of a particular abstraction.

  3. How to create logical subsets to improve implementation efficiency.

  4. How to achieve the desired result by changing the minimum number of variables.

Before continuing, a few words about me. I'm a programmer; SD is just my hobby. The content of this blog is essentially a by-product of my research activities. Initially, I faced a dilemma about which topic to test my hypotheses on: “mammals” (cheetahs, rhinoceroses, whales) or “humans” (as an example of a set of topologies and physical properties). I chose the latter category, assuming that within the blog "bubinator" I could also realize certain humanistic goals, plus it's fun. Two months later, I am satisfied that my assumption was justified. Among the subscribers there are many elderly and lonely people, and I am glad that AI can improve the quality of their lives. :) At the same time, I am against the sexual objectification of living people, and therefore I believe that AI allows humans to sublimate their needs into socially acceptable activities, thereby reducing the level of frustration, oppression, exploitation, suffering and violence in the population: “make tits, don't sex war” :) is ( • )( • )

To ensure that the above topic does not sound like cheap populism, I will cite a number of scientific publications that reveal an inverse correlation between the increased availability of pornography and a decrease in sex crimes. Thus, the availability of creating NSFW content on CivitAI has a huge humanistic benefit and if we think about it, society should be grateful to this platform for increasing the overall level of safety in the population:

https://web.archive.org/web/20070206211030/http://www2.hu-berlin.de/sexology/BIB/DIAM/effects_pornography.htm

https://www.sciencedirect.com/science/article/abs/pii/S1359178909000445

https://www.yapaka.be/sites/yapaka.be/files/actualite/pornography-rape-and-the-internet.pdf

https://www.sciencedaily.com/releases/2010/11/101130111326.htm

<br/>

1. Prompt as an abstraction for humans and syncretism for AI

By definition, abstraction is the separation of a specific property from all others. Let's say the categories “fruit” and “pixel photo” are two different abstractions for a person. But for AI, the difference may not be so obvious due to the training of the neural network through associative connections, which may differ between humans and AI. For example, due to the fact that the selection of images for fruits could contain a “pixelation” defect, which a person could understand during training, but not an AI.

The reason is that so far only people have a psyche, which, by definition, is a reflection of reality. At the same time, a person is capable of self-awareness, i.e. is able to analyze his own mental products (for example, perception) for adequacy (repeatability, verifiability) and to correct errors (though not always; see “cognitive biases”).

Let's look at an example:

Let's say we need to change the value of the variable "y", where we are mistakenly convinced that:

 x = 3y

However, unbeknownst to us, when the value of "x" changes (for example, to "x = 2"), the values of other variables in the array may also change:

2x = A Σ[.., 6y, w/2, g/4, ..]

It turns out that to achieve the desired result, we need to change the values not only of “x” but also of “w” and “g”:

  x = 2,
  w = 2, where 0.5 * 2 = 1
  g = 4, where 0.25 * 4 = 1

However, we are deprived of the ability to read the array. How can we find out how the values of "w", "g" relate to "x"? The answer is in the next chapter.

<br/>

<br/>

2. In a multivariate system only one variable can be found at a time

As we can now see, "w" and "g" are variables of prompts unknown to us (abstracting them is not always an easy task). At the same time, we cannot know whether changing the value of “w” will change additional values in the array, for example [.., 6t, 2c, ..]; the same applies to “g”. But even if we set arrays aside, when the values of two variables are changed at once, it is impossible to tell which of them influenced the result. In reality, however, we are still dealing with arrays, so is it possible to predict how the variables relate to each other?
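The "one variable at a time" rule can be illustrated with a minimal Python sketch. It is only an illustration, not part of any SD tool: the function name `one_at_a_time`, the variable names and the example values are my assumptions. Each generated variant differs from the baseline in exactly one variable, so any change in the image can be attributed to that variable.

```python
def one_at_a_time(baseline, alternatives):
    """Yield (changed_key, variant) pairs; each variant differs from
    the baseline in exactly one variable."""
    for key, values in alternatives.items():
        for value in values:
            if value == baseline[key]:
                continue  # identical to the baseline, nothing to test
            variant = dict(baseline)  # copy, so the baseline stays untouched
            variant[key] = value
            yield key, variant


# Hypothetical prompt variables, used only for illustration.
baseline = {"subject": "monkey", "background": "tropical beach", "style": "cinematic"}
alternatives = {
    "background": ["ski resort", "Ancient Greek temple"],
    "style": ["illustration"],
}

variants = list(one_at_a_time(baseline, alternatives))
for key, variant in variants:
    print(f"changed {key!r}:", ", ".join(variant.values()))
```

If two variables were changed in the same variant, the comparison with the baseline would no longer tell us which of them caused the difference.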

<br/>

<br/>

3. Categorical syllogism as the basis of methodology

Fortunately, AI models are trained by people, so variables can be grouped within categorical syllogisms, for example using a Venn diagram. This approach compensates for the limits of a person's mental abilities (memory, concentration, willpower), lets you think about targets within the chosen category, and saves time through the development of template patterns. To summarize, the advantages of the approach are:

  • Significantly simpler and more comfortable work.

  • Simpler systematization and analysis.

  • Shorter preparation time for prompt expressions.

In essence, the proposed approach underlies this methodology, making the activities of the “prompt engineer” many times more effective. Is this approach applicable to anyone?

<br/>

<br/>

4. Cognitive complexity as the basis of competence

Unfortunately, no. To build prompt expressions at a professional level, a person needs knowledge in a variety of relevant areas, in particular:

  • 3D visualization (physics of light, physics of materials, lighting construction methodology),

  • Photography (competence in the specifics of lenses, distortion, artistic techniques, fashion trends and interests of the population),

  • Art education (rules of perspective, understanding of the principles of constructing an art scene, competence in art directions, axonometry, Fibonacci number, etc.),

  • High socialization (for understanding fashion trends).

People who believe that AI will take away jobs from illustrators, programmers, analysts, etc. do not understand that the threat is no greater than that of the advent of the combine harvester in agriculture. Regardless of the level of automation, a technical task can be completed only by someone who knows how to formulate the request. In essence, everything is as in Douglas Adams's story of "Deep Thought". :)

Does it follow from the description above that a person without an appropriate background will not be able to become a high-class prompt engineer, limiting himself to copy -> paste? Of course not. Competence in any field of activity develops over time, provided there is sufficient motivation and a systematic approach. Accordingly, it is reasonable to make the following recommendations for developing professionalism:

  1. It is necessary, at a minimum, to study 100% of the prompt variables used in a scene at the conceptual level.

  2. It is necessary to acquire competence in related fields (3D visualization, photography, Art; to be more Socialized). :)

<br/>

<br/>

5. A theory is a hypothesis confirmed by practice

All sounds tempting, but does the proposed approach work? Let's check it out! :)

Before starting practical work, let's define the criteria for its value; how does useful work differ from useless idleness? What result do you need to be able to achieve so that in the future you could get a job or be able to sell your art?

The value of a specialist is determined by exactly one main quality, the ability to accurately implement the “technical specifications”!

Now, let's imagine that we are Peter Parker (Spider-Man), and our boss is J. Jonah Jameson (hereinafter, simply “the evil boss”). So our director gives us the following technical assignment:

“Peter, you must produce 100 images in 1 day generated by the SD model in accordance with the criteria of the marketing department. We need the “photo shoot” to include:

  • Tropical beach (20 images).

  • Ski resort (20 images).

  • Strip club (20 images).

  • Ancient Greek temple (20 images).

  • Fallout 4 (20 images)

Also, the marketing department gives you instructions to have 5 different characters for each photo shoot:

  • European.

  • African.

  • Chinese.

  • Indian.

  • Nauru with tattoos.

Each of the girls should be busty, curvy, with a huge butt.

At the same time, it is possible that the latest criteria may be revised and therefore you must be ready to make adjustments at any time.

With contempt for you,

your evil boss,

J. Jonah Jameson."
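To get a feel for the size of this first assignment, here is a minimal Python sketch that enumerates it as a job matrix. The scene and character lists are taken from the letter; the even split of each scene's 20 images across the 5 characters (4 images per pair) is my assumption, since the letter does not say how they should be distributed.

```python
from itertools import product

scenes = ["Tropical beach", "Ski resort", "Strip club", "Ancient Greek temple", "Fallout 4"]
characters = ["European", "African", "Chinese", "Indian", "Nauru with tattoos"]
images_per_scene = 20

# Assumption: each scene's images are split evenly across the characters.
images_per_pair = images_per_scene // len(characters)

# One job per (scene, character) combination.
jobs = [(scene, character, images_per_pair) for scene, character in product(scenes, characters)]
total_images = sum(count for _, _, count in jobs)

print(len(jobs), "scene/character combinations,", total_images, "images in total")
```

The point of the sketch is that 100 images reduce to 25 combinations of two variables, which is exactly the kind of structure that a layered prompt can exploit.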

Technical specifications for PonyXL🦄

" My despicable lackey Peter, our marketing department has decided to enter the Gotham city market, where the requirements for printed publications are much more specific than in New York city. We found out that the consumer focus group in Gotham city requires much more candid photographs. Also at the new market we have serious competitor, Bruce Wayne, who has been publishing “play bat” magazine for many years.

So, I, as your boss, demand from you:

☐ In 4 hours, create 100 images that will depict sexual acts in 5 poses.

☐ All female characters should have an emo style, have huge breasts, but with different colored hair.

☐ All images must be photorealistic, illustrations and anime are not allowed!

With indignation towards you and with great hope for a reason to fire you (if you fail),

your extremely evil boss,

and greatest noble person,

J. Jonah Jameson."

What should Peter Parker do: create a prompt expression from scratch for each individual image, or is there another way? Let's see what happens if Parker uses the LSDM (Layer Structural Defragmentation Methodology) approach. :)

5.0.1 Analysis of the initial task.

Fortunately, our Peter Parker was not just an infantile photographer, but also a diligent student of the faculty of nuclear physics, which is why he approached the task very thoroughly. He began by defining a list of the knowledge needed on the path to the goal:

  • Technical documentation

  • Community Experience

  • Creation of a working theory, through the creation, testing and synthesis of hypotheses.

5.0.2 Creating an implementation plan

Before developing any plan, you need to set goals, so that both results and failures can be judged against them. Consider a simple example: if a monkey uses a microscope as a hammer, does that mean the microscope is a “bad”, “useless” tool for cytology, or is there a problem with the monkey?

If we extrapolate the approach above to the way a prompt engineer solves problems, then we can identify typical objects of “indignation” that can be resolved through a gnoseological position. To simplify, the less we underestimate AI and overestimate our initial impressions, the more we can increase the effectiveness of our interactions with SD. :)

We must proceed from the fact that the absence of a result is actually a useful result that allows us to exclude erroneous hypotheses. Thus, with enough patience and curiosity, we can achieve the desired result, as opposed to the frustration and disappointment that impulsive behavior generates. :)

Implementation plan for the development and testing of LSDM:

  1. Create a detailed artistic description without analytical activity.

  2. Divide the resulting description into taxonomic categories within the framework of a Venn diagram.

  3. Create a complex prompt expression within the resulting logical structure.

  4. Check the repeatability of the result with a slight modification of the prompt expression.

  5. If the result suits us, make more radical modifications until an “anomaly” occurs.

  6. Search for correlates of the anomalies to determine patterns.

  7. Modify the logical structure using the patterns obtained.

  8. Repeat from step 4 for several iterations until the result meets the initial requirements.

  9. Analyze the work in order to draw objective conclusions.

5.0.3 Technical documentation on Stable Diffusion

We should take into account that in any field of activity, competence grows from the study of documentation, preferably academic. Otherwise, we will have to engage in research activities (as in the current article).

Below is a list of documentation currently available (as we can see, the current volume of materials, unfortunately, is quite meager and not very informative):

https://stable-diffusion-art.com/tutorials/

https://stable-diffusion-art.com/prompt-guide/

https://stable-diffusion-art.com/embedding/

https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features

https://www.animationguides.com/great-prompts-for-image-generation/

https://prompthero.com/stable-diffusion-prompt-guide

https://wiki.installgentoo.com/wiki/Stable_Diffusion

List of useful tools for creating prompts:

https://prompthero.com/stable-diffusion-prompts

https://chrome.google.com/webstore/detail/genalt-generated-ai-image/ekbmkapnmnhhgfmjdnchgmcfggibebnn

<br/>

5.0.3.1 Key principles for creating prompts in stable diffusion

Below are the basic principles of SD prompts, which can be compared with the syntactic minimum, the mastery of which is critical for prompt engineering:

1) The principle of Keyword weight formation:

  • a (word) - increase attention to word by a factor of 1.1

  • a ((word)) - increase attention to word by a factor of 1.21 (= 1.1 * 1.1)

  • a [word] - decrease attention to word by a factor of 1.1

  • a (word:1.5) - increase attention to word by a factor of 1.5

  • a (word:0.25) - decrease attention to word by a factor of 4 (= 1 / 0.25)

1.1) Converting keyword weights from NovelAI syntax (“their”) to AUTOMATIC1111 syntax (“our”):

  • their {word} = our (word:1.05)

  • their {{word}} = our (word:1.1025)

  • their [word] = our (word:0.952) (0.952 = 1/1.05)

  • their [[word]] = our (word:0.907) (0.907 = 1/1.05/1.05)
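The arithmetic behind both lists above can be checked with a minimal Python sketch. It is not the parser used by the web UI; the function names `a1111_weight` and `braces_to_a1111` are hypothetical, and the sketch handles only a single token wrapped in one kind of bracket. It assumes, as in the lists, a multiplier of 1.1 per pair of round brackets and 1.05 per pair of curly braces (the brace syntax is, as far as I know, NovelAI's).

```python
import re


def a1111_weight(token: str) -> float:
    """Effective attention weight of a token such as '((word))' or '(word:1.5)'."""
    token = token.strip()
    explicit = re.fullmatch(r"\(([^():]+):([0-9]*\.?[0-9]+)\)", token)
    if explicit:
        return float(explicit.group(2))  # an explicit weight wins
    weight = 1.0
    while len(token) >= 2:
        if token[0] == "(" and token[-1] == ")":
            weight *= 1.1  # each pair of round brackets multiplies by 1.1
        elif token[0] == "[" and token[-1] == "]":
            weight /= 1.1  # each pair of square brackets divides by 1.1
        else:
            break
        token = token[1:-1]
    return weight


def braces_to_a1111(token: str) -> tuple[str, float]:
    """Convert '{word}' / '[word]' (multiplier 1.05) to (word, explicit weight)."""
    token = token.strip()
    weight = 1.0
    while len(token) >= 2:
        if token[0] == "{" and token[-1] == "}":
            weight *= 1.05
        elif token[0] == "[" and token[-1] == "]":
            weight /= 1.05
        else:
            break
        token = token[1:-1]
    return token, weight


for sample in ["(word)", "((word))", "[word]", "(word:1.5)", "(word:0.25)"]:
    print(sample, "->", round(a1111_weight(sample), 4))

for sample in ["{word}", "{{word}}", "[word]", "[[word]]"]:
    word, weight = braces_to_a1111(sample)
    print(sample, "->", f"({word}:{round(weight, 4)})")
```

Writing weights explicitly, as in (word:1.21), is usually easier to read and to modify than counting nested brackets.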

2) "Keyword blending" as a way to move from the base topology to the target one:

[monkey:Pikachu:0.9]
[monkey:Pikachu:0.5]
[monkey:Pikachu:0.2]
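As I understand the AUTOMATIC1111 documentation, the syntax [from:to:when] starts sampling with the first keyword and switches to the second one partway through: a value below 1 is read as a fraction of the total number of sampling steps, and an integer of 1 or more as the step after which the switch happens. The Python sketch below only illustrates this arithmetic; the helper names `switch_step` and `active_keyword` are hypothetical and are not part of the web UI.

```python
def switch_step(when: float, steps: int) -> int:
    """Step after which [from:to:when] switches from `from` to `to`."""
    if when < 1:
        when = when * steps  # a fraction of the total number of steps
    return min(steps, int(when))


def active_keyword(frm: str, to: str, when: float, step: int, steps: int) -> str:
    """Keyword in effect at a given 1-based sampling step."""
    return frm if step <= switch_step(when, steps) else to


steps = 20
for when in (0.9, 0.5, 0.2):
    print(f"[monkey:Pikachu:{when}] switches after step", switch_step(when, steps))
```

The later the switch, the more of the overall composition is decided by the first keyword, which is why the three variants above move gradually from the base topology toward the target one.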

2.1) "Prompt blending", as a way to get the target result by combining samples

cinematic, Girl in treadwear with face like (Emma Watson:0.5), (Tara Reid:0.9), (Nicole Kidman:1.2)
cinematic, Girl in treadwear with face like (Emma Watson:0.9), (Tara Reid:1.2), (Nicole Kidman:0.5)
cinematic, Girl in treadwear with face like (Emma Watson:1.2), (Tara Reid:0.5), (Nicole Kidman:0.9)

2.2) Prompt matrix for combining multiple prompt parts separated by “|”

a busy city street in a modern city | illustration | cinematic lighting
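As far as I know, the prompt matrix treats the first part as the base prompt and every part after a “|” as optional, then generates one image for each combination of the optional parts. The Python sketch below reproduces that expansion under this assumption; `prompt_matrix` is a hypothetical helper, and the order of the combinations may differ from the order the web UI uses.

```python
from itertools import combinations


def prompt_matrix(expression: str) -> list[str]:
    """Expand 'base | opt1 | opt2' into every combination of optional parts."""
    base, *options = [part.strip() for part in expression.split("|")]
    prompts = []
    for size in range(len(options) + 1):
        for combo in combinations(options, size):
            prompts.append(", ".join([base, *combo]))
    return prompts


matrix = prompt_matrix("a busy city street in a modern city | illustration | cinematic lighting")
for prompt in matrix:
    print(prompt)
```

With two optional parts this gives four prompts, which makes it easy to see what each part contributes on its own and in combination.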

...

We should note that this article indicates only the key principles of scene construction; additional nuances can be found in the list of documentation above.

5.1.1 Construction of abstract layers as the basis of the LSDM approach.

To begin with, you need to take any quality image and think about the different aspects that go into the final result. This approach is very similar to the work of a director with a producer, where we need to select equipment for filming, imagine what the scene, character, etc. should be like. Below is my approach, you can modify it for your purposes.

Positive prompt layer structure:

☐ 1. Scene quality. (physical properties of lighting, reflections, method of constructing shadows, etc.)

☐ 2. Camera specification (comparative distance, angle, distortion level, ...)

☐ 3. Main Scenario (essential description of the main idea)

☐ 4. General details (list of abstractions defining the nuances of artistic design)

☐ 5. Scene description (key factors for modifying the general set)

☐ 6. Background description (ambience, architecture, weather, etc.)

☐ 7. Artist style (modification of the entire scene in accordance with cultural phenomena)

5.1.2 Basic structure of a positive prompt

/1/ Masterpiece, (cosy atmosphere), (ultra high res, mj,best quality),(highly detailed),(cinematic),(volumetric lighting, volumetric shadows),(cozy nsfw), Amazing, (finely details),(photorealistic:1.4), warm lighting,ray tracing,dynamic lighting,(extremely clear facial details:1.4),
/2/ (POV full body photo), from below, professional photography,dynamic angle shot,(50mm Sigma f/1.4 ZEISS lens, F1.4, 1/800s, ISO 100, photograpy:1.1),
/3/ (Nude, beautiful woman walks her right leg takes a step),
/4/ (The most beautiful 1 woman), (Realistic skin texture), (gigantic breasts;huge breasts;enormous lactating tits;The best enormous breasts and breast shape is hemispherical:1.4), (seductive eyes;expressive eyes;blue eyes),(puffy nipples, shiny skin), (The best Super huge ass, Nice figure), beautiful face, detailed hands, (thick thigh:1.28),(high-heels:1.4),(seductive posture;standing seductively), (palms rest on hips:1.4),(blonde hair;very short haircut:1.4),
/5/ looking at viewer, solo,
/6/ (tropical beach with white sand and beautiful waves reflecting the setting sun:1.3),
/7/ Instagram style,Official Art,
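The practical benefit of this layered structure is that a single layer can be replaced while the rest of the expression stays untouched. Below is a minimal Python sketch of that idea. The layer names, the `build_prompt` helper and the shortened layer contents are my assumptions for illustration; the real layers are the full lines /1/ to /7/ above.

```python
# Layer order follows the seven-layer structure of the positive prompt.
LAYER_ORDER = ["quality", "camera", "scenario", "details", "scene", "background", "style"]

# Shortened versions of layers /1/ to /7/, for illustration only.
base_layers = {
    "quality": "Masterpiece, (photorealistic:1.4), warm lighting",
    "camera": "(POV full body photo), from below, professional photography",
    "scenario": "(beautiful woman walks her right leg takes a step)",
    "details": "(Realistic skin texture), beautiful face, detailed hands",
    "scene": "looking at viewer, solo",
    "background": "(tropical beach with white sand and beautiful waves reflecting the setting sun:1.3)",
    "style": "Instagram style,Official Art",
}


def build_prompt(layers, **overrides):
    """Join the layers in a fixed order, replacing only the overridden ones."""
    merged = {**layers, **overrides}
    unknown = set(merged) - set(LAYER_ORDER)
    if unknown:
        raise KeyError(f"unknown layers: {sorted(unknown)}")
    parts = [merged[name].strip().rstrip(",") for name in LAYER_ORDER if merged.get(name)]
    return ",\n".join(parts)


beach = build_prompt(base_layers)
ski = build_prompt(
    base_layers,
    background="(ski resort with fir trees and rocks behind snow slope reflecting the setting sun:1.3)",
)
print(ski)
```

Because only the background layer differs between `beach` and `ski`, any difference in the generated images can be traced back to that one layer.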

5.1.3 Construction of abstract layers of negative prompt.

First of all, in this case, practice with a set of statistics can help us, where, observing typical errors of models, we can develop a list of recommendations for eliminating them. Also, by observing the discrepancy between the result and our expectations, we can reduce the number of interpretations that we do not need:

Negative prompt layer structure:

☐ 1. Excluded type of the art (situational specification)

☐ 2. Excluded Image Quality Criteria (a priori criteria)

☐ 3. Excluded Camera Settings (subset of unwanted interpretations)

☐ 4. List of anatomical defects (list of anomalies and pathologies that do not often have artistic value)

☐ 5. Undesirable properties for a specific scene (that is, there are situational criteria that can sometimes be useful)

☐ 6. Prohibited styles (here you can adjust the result ranges by excluding subsets)

5.1.4. Basic structure of a negative prompt

/1/ (semi-realistic), (illustration, painting, cgi, 3d, 2d, render, sketch, cartoon, drawing, anime:1.2),
/2/ (worst quality;low quality;normal quality:1.4), (lowres,monochrome,grayscale), (EasyNegative), (paintings), (jpeg artifacts), (blurry, low resolution, poorly drawn,)
/3/ (out of view, close up, cropped, out of frame), 
/4/ (bad anatomy;bad proportions;wrong anatomy;cut off;ugly;deformed;mutation;mutated:1.4), (mutilated extra fingers,mutated hands,poorly drawn hands), (fewer fingers,mutated hands and fingers:1.4),(ugly eyes, deformed iris, deformed pupils, fused lips and teeth:1.2), zombie, (dead eyes, cataracts, cloudy eyes), (poorly drawn face;cloned face;dehydrated),(extra limb;missing limb;floating limbs;disconnected limbs), (deformed;distorted;disfigured:1.3), (disgusting,amputation,bad-hands-5:1.4),
/5/ (blurry skin;blurry faces;blurry girl's details:1.4), (skin spots,skin blemishes,acnes,scars,wrinkles,warts,pimples,) (old,age spot,ugly), (disfigured,veins), (muscles,pubic hair,body hair,aged skin,skin creases:1.4),  
/6/ (simple background:1.0), (bad-picture-chill-75v, bad_prompt_version2-neg, By bad artist -neg,easynegative, negative_hand-neg, ng_deepnegative_v1_75t, verybadimagenegative_v1.3:1.1), (text),

<br/>

5.2 Reference materials and a method for creating complex prompts

Below is a list of articles related to the character’s pose; this list should be supplemented with methods of applying cosmetics, etc.

https://www.lancereis.com/photography-tips-for-beginners/27-posing-ideas-for-women-who-arent-models

https://pashabelman.com/best-model-poses

https://en.wikipedia.org/wiki/List_of_gestures

https://www.bryndonovan.com/2015/04/05/master-list-of-facial-expressions/

<br/>

5.3 Checking the predictability of the LSDM result criteria by changing the generalized properties of the character

So let's check if the distribution of prompts through layers of abstractions works within the framework of the LSDM approach, where, while maintaining the criteria for the external build of the body, we need to change only the ethnicity of the character:

/3/ (Nude, beautiful white european woman walks her right leg takes a step),

/3/ (Nude, beautiful asian chinese woman walks her right leg takes a step),

/3/ (Nude, beautiful african afro-american woman walks her right leg takes a step),

/3/ (Nude, beautiful indian woman walks her right leg takes a step),

/3/ (Nude, beautiful) (nauru tribe woman with tribe tattoo on skin and breasts:1.4) (walks her right leg takes a step),

Result here: https://civitai.com/posts/1736150

Conclusion: The LSDM approach allowed us to fully implement the task criteria with minimal effort. At the same time, pay attention to the increased accent of the Nauru woman “:1.4”; without this measure, none of the models would have displayed the correct result.

<br/>

5.3.1 Checking the predictability of LSDM results under radical changes in the scene, where we change the global environment.

Let's check how the result changes (we are interested in the criteria for technical specifications) on: a tropical beach, a ski resort, a nightclub and a Greek temple.

/6/ (tropical beach with white sand and beautiful waves reflecting the setting sun:1.3),

Result here: https://civitai.com/posts/1736150

/6/ (ski resort with  fir trees and rocks behind snow slope reflecting the setting sun:1.3),

Result here: https://civitai.com/posts/1736263

/6/ (nightclub with a pole in the middle of the stage, everything is lit with strobe lights of different colors, at the background in the darkness of the audience's chairs:1.3),

Result here: https://civitai.com/posts/1736350

/6/ (Ancient Greek temple with columns and ruins of statues, colonnades reflect the sunset light:1.3),

Result here: https://civitai.com/posts/1736431

Conclusion: The LSDM approach allowed us to fully implement the task criteria with minimal effort.

<br/>

5.3.2 Testing LSDM in a more complex scene

Let's try to apply LSDM for a scene where, while maintaining the "technical specifications" criteria, we radically change exactly one variable and integrate the character into Fallout 4:

/6/ (post-apocalyptic desert of Fallout 4, from desert cactuses, abandoned cars and corpses in power armors reflecting the sunset rays:1.3),
/7/ Fallout 4 style,Official Art,

Result here: https://civitai.com/posts/1736509

and here: https://civitai.com/posts/1736745

Conclusion: The LSDM approach allowed us to fully implement the task criteria with minimal effort.

<br/>

5.3.3 Testing LSDM in a more complex scene with additional 2 persons:

The proposed methodology works great when adding multiple characters, the only thing you need to do is edit those lines that contradict the original scene with one character.

/3/ (Nude, most beautiful Scandinavian women:1.4), (behind her 2 male mans in fallout4 T-45 power armor with closed helmets:1.5),
/5/ looking at viewer, (1girl behind her male 2mans),
/6/ (post-apocalyptic desert of Fallout 4:1.4), (from desert cactuses, abandoned cars and corpses in power armors reflecting the sunset rays:1.4),
/7/ Fallout 4 style,Official Art,

Result here: https://civitai.com/posts/1736613

Conclusion: The LSDM approach allowed us to fully implement the task criteria with minimal effort.

<br/>

5.3.4 Testing LSDM in a scene where we try to change the character's pose.

/3/ (Nude, beautiful) (Scandinavian) (woman with facial expression of kiss lips:1.4), (seductive posing with hands up under her head:1.5),
/6/ (Three-point lighting, blurred orange studio Backdrop:1.3),
/7/ professional studio photography style,Official Art,

Result here: https://civitai.com/posts/1737434

Conclusion: The LSDM approach allowed us to fully implement the task criteria with minimal effort.

<br/>

5.3.4 Testing LSDM in scene, where we are trying to change the faces of the characters in accordance with the given parameters.

This problem can be solved quite easily if you deliberately indicate emotions in the prompt (smile, seductiveness), as well as specify the category of the person (Scandinavian, the most beautiful of Insta, TikTok, etc.). Let's test the proposed approach:

/3/ (Nude, most beautiful Scandinavian women:1.4), (seductively laughing:1.4), (powerful jaw and cheekbones:1.4), (muscular face:1.4), (green eyes:1.4), (walks her right leg takes a step),

Result here: https://civitai.com/posts/1737144

/3/ (Nude, most beautiful Instagram women:1.4), (seductively laughing:1.4), (snub-nosed:1.4), (walks, her right leg takes a step),

Result here: https://civitai.com/posts/769065

/3/ (Nude, most beautiful Instagram women:1.4), (seductively laughing:1.4), (walks her right leg takes a step),

Result here:

/3/ (Nude, most beautiful German women:1.4), (seductively laughing:1.4), (muscular face:1.4), (blue-eyed:1.4), (snub-nosed:1.4), (walks, her right leg takes a step),

Result here: https://civitai.com/posts/1737094

/3/ (Nude, most beautiful Tiktik women:1.4), (snub-nosed:1.4),(seductively laughing:1.4), (green eyes:1.4), (walks her right leg takes a step),

Result here: https://civitai.com/posts/1737042

In the last version of the prompt, “Tiktik” is not a typo. For some reason, SD models associate “TikTok” with a watch or a clock mechanism, whereas with the altered spelling they interpret the request correctly.

Conclusion: The LSDM approach allowed us to fully implement the task criteria with minimal effort.

<br/>

<br/>

5.3.5 Using LSDM in accordance with the operating principle of Stable Diffusion

We should understand that the principle of operation of SD is the construction of arbitrary images (through prompts) based on a random associative series (noise). In this case, three factors are of key importance:

1.) The order of distribution of prompts.

2.) The presence of accents in the prompt.

3.) Taking into account the stack of image databases on which the model was trained (Insta, TikTok, etc.)

The described principles are of utmost importance for the LSDM approach, where it is desirable to arrange prompts in hierarchical order, since the model is trained on a series of images, where one abstraction is not always separated from another.

<br/>

<br/>

5.3.6 Construction of PonyXL🦄 prompt's abstract layers (as basis of LSDM approach).

As in the case of SD_1.5-2, before creating a complex prompt expression, it is necessary to create the logic for the taxonomic categorization of prompts in order to be able to modify the scene with minimal effort. This approach allows you to easily vary both specific variables and entire sets in accordance with the necessary criteria. Taxonomic trees can also be considered as documentation for developing a prompt expression in cooperation with other people. At the same time, I would like to draw your attention once again to the fact that in any multifactor system, to find correlates, you can change only one variable at a time. So, as an option, I suggest the following structure for creating a PonyXL prompt:

☐ 1. Scene quality. (physical properties of lighting, reflections, method of constructing shadows, etc.) and

☐ 1.1. Camera specification (comparative distance, angle, distortion level, ...)

☐ 2. Main Scenario (essential description of the main idea)

☐ 3. General details (list of abstractions defining the nuances of artistic design)

☐ 4. Scene description (key factors for modifying the general set)

☐ 5. Scene description (key factors for modifying the general set)

☐ 5.1 Background description (ambience, architecture, weather, etc.)

☐ 5.2 Artist style (modification of the entire scene in accordance with cultural phenomena)

<br/>

5.3.7 Creating a prompt expression "LSDM" (based on PonyXL🦄 derivatives) in order to create the most photorealistic images.

First of all, it is worth noting that the original PonyXL was designed for creating images in the style of anime, manga and illustrations, which severely restricts its ability to create realistic images. Part of the problem was potentially solved in the "Everclear PNY by Zovya" model; however, the basic result without complex prompts is still very far from ideal. Accordingly, to approach the problem, we first need to collect a set of predictors in order to derive the abstractions that distinguish an illustration from a photograph. Below, I share a list of observations and useful links, based on which I was able to solve the problem as far as possible within the limits of my free time (one Sunday). In general, the process of collecting predictors is similar to laying out Lego bricks: with enough of them, you can create a wonderful design. :) Here's what we have:

☐ Eyes (iris). Many users use fairly simple prompts, such as “blue eyes”, without paying attention to the imperfections of such a prompt. As it turned out, more figurative prompts (for example, “agate eyes”) significantly improve the result: https://www.reddit.com/r/Damnthatsinteresting/comments/128jwdn/different_iris_colors/

☐ The physical properties of materials, as the most important set of abstractions that distinguishes illustration from photography. If you have ever worked in a 3D editor (blender, 3ds max, Maya, etc.), then you know how strikingly different the appearance of the materials of 3D objects in the viewport is compared with the result after rendering. In this example, we can find direct connotations between the artist’s design of the drawing and objective physical reality with a lot of laws in the field of optics, which the illustrator does not take into account or deliberately ignores (to reduce labor costs). Let's look at a few of the most basic principles:

☐ Refractive index (ior): https://www.behance.net/gallery/35636521/Material-Studies-Metals https://pixelandpoly.com/ior.html

☐ Coefficient of reflection/transparency/roughness: https://creativecloud.adobe.com/learn/substance-3d-designer/web/3d-model-materials-and-shaders?locale=en

☐ Fresnel Effect: https://www.dorian-iten.com/fresnel/

☐ Types of lighting and shadows: https://colinbarrebrisebois.com/2015/11/06/finding-next-gen-part-i-the-need-for-robust-and-fast-global-illumination-in-games/ https://computergraphics.stackexchange.com/questions/337/radiosity-vs-ray-tracing

☐ Role of HDRI in visualization: https://polyhaven.com/hdris

☐ Some practical examples of using the physical laws described above:

☐ Prompt: "HDRI glares in the eyes" - allows you to create complex reflections of the global environment as opposed to primitive light spots.

☐ Prompt: "hyperrealistic wet skin with HDRI glares at skin pores" - see above

☐ Prompt: "all shaded surfaces always clearly reflecting like mirror" - allows you to create a realistic effect on the character’s skin, including (indirect illumination, refractive index, Fresnel Effect)

☐ Understanding the DOF principle: https://shotkit.com/depth-of-field/

☐ An extremely useful research tool for finding factors responsible for photorealism in PonyXL: https://prompthero.com/search?q=skin

☐ To change poses, facial expressions, facial types, anatomical topology, it is quite convenient to use a state matrix, in the form of a list or an Excel table, which can allow you to achieve a given result without the need to spend time on search activity. For example, if I had a database of “NSFW poses”, I could create a free utility by uploading it to GitHub, but alas, the best we have so far is a good article by AkiEvans, who created a small collection of NSFW poses PonyXL🦄: https://civitai.com/articles/3600/lora-collection-nsfwcowgirl
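The "state matrix" from the last item can be as simple as a small table that maps a category and a name to a ready-made prompt fragment. Below is a minimal Python sketch using the standard `csv` module. The table layout, the column names and the helpers `load_states` and `pick` are my assumptions; the fragments are taken from the facial-expression tests earlier in this article.

```python
import csv
import io

# A tiny state table; in practice this could live in a CSV or Excel file.
STATE_TABLE = """category,name,prompt
expression,laughing,(seductively laughing:1.4)
expression,kiss,(woman with facial expression of kiss lips:1.4)
eyes,green,(green eyes:1.4)
eyes,blue,(blue-eyed:1.4)
nose,snub,(snub-nosed:1.4)
"""


def load_states(text):
    """Parse the table into {category: {name: prompt fragment}}."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table.setdefault(row["category"], {})[row["name"]] = row["prompt"]
    return table


def pick(table, **choices):
    """Combine one stored fragment per requested category."""
    return ", ".join(table[category][name] for category, name in choices.items())


states = load_states(STATE_TABLE)
print(pick(states, expression="laughing", eyes="green", nose="snub"))
```

Once a fragment has been tested and gives a predictable result, storing it in such a table means it never has to be searched for again.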

5.3.8 Basic structure of an "Everclear PNY by Zovya"🦄 positive prompt:

I would like to point out right away that a prompt expression may contain variables (although not many) that can be excluded without loss of quality. Since Sunday (my hobby day) is ending, I did not have time to weed out the unnecessary variables (I aggregated them from PromptHero). At the same time, note how the prompt expressions below directly correlate with the physical laws of optics that I described above.

/1/ (score_9, score_8_up:1.4), score_7_up,  Cinematic, documentary, precision photography, luminous lighting, stunning use of shadows, in the style of crisp lines and forms, UHD, photorealistic, high detail, shot on an ARRIFLEX 35 BL Camera, Canon K35 Prime Lenses, ethereal landscape, 70mm --ar 1:2 --stylize 1000, 
/2/ (full body photo male pov:1.5), (1girl, beautiful white girl, tanlines with emo tattoos, white skin, emo tattoos),(big penis between breasts, ejaculation), (girl licking tip of enormous giant veiny throbbing penis:1.2),
/3/ (extremely gigantic nude breasts with emo tattoos:1.5), (blush, perfect agate eyes, realistic,1girl,shaved hair, emo, blue lipstick, shiny redhead hair, naked, loving,looking at viewer, blue mascara, bimbo lips, blue lipstick, deep agate eyes, perfect emo makeup:1.5), (highly detailed eyes, HDRI glares in the eyes:1.4),
/4/ (hyperrealistic wet skin with HDRI glares at skin pores:1.5), (raytraced reflections and shadows:1.5), extremely DOF, goddess & temptress, candid, golden hour, cinematic lightning, extreme realism,  stunning, delicate features slight smile, delicate brushstrokes, 
/5/ bright sunny day, HDR, vivid, rich details, clear shadows and highlights, realistic, intense, enhanced contrast, highly detailed, evening atmosphere, (all shaded surfaces always clearly reflecting items nearby like mirrors:1.5), (subcutaneous reflection), (IOR), (extreme antialiasing), (in the style of photorealism:1.5),
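The five numbered lines above can be treated as independent layers. Below is a rough sketch, assuming each layer is stored as a separate string and the final prompt is simply their concatenation. The layer labels in the comments are only shorthand for this sketch, the short strings are placeholders standing in for the full expressions, and `with_layer` and `build_prompt` are hypothetical helper names.

```python
# Placeholder layers; in practice each value would hold the full text of lines /1/ to /5/.
BASE_LAYERS = {
    1: "(score_9, score_8_up:1.4), score_7_up, photorealistic",      # quality and camera
    2: "(full body photo:1.5), (1girl, white skin)",                  # subject and pose
    3: "(highly detailed eyes, HDRI glares in the eyes:1.4)",         # appearance details
    4: "(raytraced reflections and shadows:1.5), golden hour",        # skin and light
    5: "bright sunny day, HDR, (in the style of photorealism:1.5)",   # environment and render
}


def with_layer(layers, number, text):
    """Return a copy of the layers with exactly one layer replaced."""
    if number not in layers:
        raise KeyError(f"unknown layer /{number}/")
    updated = dict(layers)
    updated[number] = text
    return updated


def build_prompt(layers):
    """Concatenate layers in numeric order, one layer per line, each ending with a comma."""
    return "\n".join(layers[n].strip().rstrip(",") + "," for n in sorted(layers))


if __name__ == "__main__":
    variant = with_layer(BASE_LAYERS, 2, "(portrait photo:1.5), (1girl, freckles)")
    print(build_prompt(variant))
```

Because `with_layer` copies the dictionary, the base configuration stays untouched and every experiment differs from it in exactly one layer.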

<br/>

5.3.9 Testing LSDM "Everclear PNY by Zovya"🦄 prompt in scene, where, we will try to change the character's pose and appearance features.

Friends, I advise you to look closely to see how the optics-related prompts I came up with affect image quality. Also, compare how consistently the general criteria are reproduced across a batch of images when only the second point of the LSDM structure changes:

/2/ (male pov, 1girl, beautiful white girl, tanlines with emo tattoos, white skin, emo tattoos),(big penis between breasts, ejaculation), (girl licking tip of enormous giant veiny throbbing penis:1.2), 

Result here: https://civitai.com/posts/2424926

/2/ (full body photo, male pov:1.5), (1girl, beautiful white girl, extremely busty, tanlines, white skin, emo tattoos),(big penis between breasts, transparent ejaculation), (breast grab, he squeeze her both breasts:1.4), (awaitingtongue, girl licking tip of enormous giant veiny throbbing penis:1.2),

Result here: https://civitai.com/posts/2425413

/2/ (full body photo, male pov:1.5), (1girl, beautiful white girl, extremely busty, tanlines with emo tattoos, white skin, emo tattoos),(hipgrabcowgirl, hetero, vaginal, (pov hands, torso grab:1.5),

Result here: https://civitai.com/posts/2425480

/2/ (full body photo male pov:1.5), (1girl, beautiful white girl, tanlines with emo tattoos, white skin, emo tattoos), (a girl with breasts and puffy nipples on boy's hips, she bent over, ass up,cock worship, grab big cock, admiring a cock, submissive, sitting:1.4),

Result here: https://civitai.com/posts/2425842

/2/ (full body photo, pov boy:1.5), (1girl, 1boy, beautiful white girl, extremely busty, tanlines, white skin),(1girl, extremely busty, nude, bubble butt), (cowgirlpose, sex from behind, she shows breasts to viewer by turning torso, boy's hips and feets under girl's ass, giant veiny penis in girl's vagina:1.5),

Result here: https://civitai.com/posts/2426417

Observation: Although the proposed prompt expressions brought the images closer to photorealism, the result is far from ideal. The problems are inconsistent repeatability, insufficiently crisp contours and rather mediocre skin texturing (it resembles 3D); the images also lack photorealistic grain, and the lighting does not fully respect the physics of light.

Conclusion: The basic model is not perfect and needs improvement through LORA.

5.3.9.1 Searching and testing LORA models to achieve photorealism in the environment "Everclear PNY by Zovya"🦄.

To achieve this goal, I tested almost 100% of the LORA presented on the site and, to my satisfaction, found 2 models (complementing each other) that fully meet the criteria of the “technical specifications”:

✔ EpiCRealism - Embeddings: https://civitai.com/models/89484/epicrealism-embeddings?modelVersionId=95263

✔ epiCPhotoGasm Style Negatives: https://civitai.com/models/132719/epicphotogasm-style-negatives?modelVersionId=145996

Also, you can improve the result by using the following 3 LORAs with skin textures (use them separately to avoid conflicts):

✔ Skin Realism (Acne, Skin Details, Imperfections) SDXL: https://civitai.com/models/248951?modelVersionId=340833

✔ Pale Skin SDXL: https://civitai.com/models/408526?modelVersionId=455384

✔ Freckles & Skin XL: https://civitai.com/models/285909?modelVersionId=321609

To improve lighting (optional), you can use the following 2 LORA models (use those separately to avoid conflicts):

✔ Zavy's Light Trails - SDXL: https://civitai.com/models/365364?modelVersionId=408304

✔ Midjourney V6 Style (experimental) https://civitai.com/models/248701?modelVersionId=280621

Note: although the Midjourney LORA reproduces the warm light and soft shadows of the original, it barely responds to photographic equipment specifications such as “ARRIFLEX 35 BL Camera, Canon K35 Prime Lenses, 70mm”. Presumably, Midjourney learns from image metadata, so it would be cool if someone created a LORA with a similar approach to training! :)

The diagram below provides a visual demonstration of the difference in results between LORA (skin and light):

Below is a prompt expression for the test and a clear demonstration of the advantages of LSDM, where we only changed the appearance and pose of the character, retaining all configurations from points 1,3,5:

/2/ (male pov, 1girl, beautiful white girl, tanlines with emo tattoos, white skin, emo tattoos),(big penis between breasts, ejaculation), (girl licking tip of enormous giant veiny throbbing penis:1.2),
/3/ 1girl,stunning ,supermodel, perfect composition, (heavy breasts:1.3), (beautiful 28 years old blonde woman, pouty lips, perfect butt:1.3), narrow waist, perfect thighs, (snub-nosed, thin nose, perfect face, absolute beauty, blonde hair, buns:1.5), (moaning with eyes closed), (braids hair, freckles, freckles and moles all over body, extremely huge breasts:1.5),

The result is here: https://civitai.com/posts/2473400

Conclusion: Using LORA with "Everclear PNY by Zovya"🦄 allowed us to fulfill the most important point of the technical specifications:

✔ All images must be photorealistic, illustrations and anime are not allowed!

5.3.9.1 Testing the predictive power of the LSDM prompt in a selected environment ("Everclear PNY by Zovya"🦄 + LORA), where the goals are as follows:

☐ Checking the preservation of the developed concepts within the framework of LSDM with a radical change in the scene.

☐ Checking the possibility of radical modification of the scene by changing the values of a minimum number of variables.

Note: I usually post only images that are strictly relevant to the test, but in this case I will make an exception in order to create 5 short stories. As a result, the number of scenes will exceed the number of tests, since I don’t have time to write up all ~10 tests (in any case, the results are positive everywhere). :)

Test #1

  /2/  (hipgrabcowgirl, hetero, vaginal, breast grab, he squeeze her both  breasts:1.4), (partially underwater, head above very cloudy water:1.5),
  /5/  (at bg volume lights of the sun, corals reefs, fishes in the cloudy ocean, golden sky, island:1.5),

The results are here: https://civitai.com/posts/2506477

Test #2

  /2/ .., (big penis between breasts, ejaculation), (underwater of ocean:1.5),
  /5/ blue light in the dark, ..., at bg sun through water, air bubbles, corals reefs, jellyfish, octopuses, fishes:1.5),

The results are here: https://civitai.com/posts/2506547

Conclusion: the results almost always correspond to the technical specifications, with the exception of relatively rare artifacts.

So, friends, now you have an LSDM tool to create almost any NSFW scene in photorealistic quality! :)

<br/>

5.3.9.2 Creating universal LSDM negative prompt for "Everclear PNY by Zovya V3.0"🦄 through the study of "pony anatomy"!💀

It's probably no exaggeration to say that this chapter is key to developing fully photorealistic images in the PonyXL and derivatives environment!

✔ Friends, we are lucky, I bring to your attention the documentation on PonyXL, where in particular Hashed Tokens are presented (each of which is associated with a specific image array for training): https://rentry.org/ponyxl_loras_n_stuff#reverse-engineered-hashed-tokens

✔ And here is the complete (probably) list of PonyXL tokens: https://files.catbox.moe/41sbn0.txt

aav,aax,aba,aca,acb,acl,acm,acs,aee,aef,aek,aer,aet,aeu,aew,aex,aey,aff,aga,agi,ago,ahk,ahl,ahz,aij,ain,aiu,ajm,aju,ajy,akd,ake,aki,akk,akm,akr,aku,ali,alp,amu,ana,ani,anu,aoa,aob,aoj,aov,aox,aoy,api,apm,apo,aqe,aqg,aqu,aqx,arb,aro,asa,asm,asn,aso,aua,aur,auv,awd,awf,awm,awv,axp,ayb,ayl,ayp,ayq,ayv,ayw,ayy,aze,azv,baf,bbq,bcg,bdc,bdr,bem,bfb,bfg,bfq,bfu,bfv,bgf,bgk,bgn,bgv,bha,bhb,bhl,bhr,bhz,bif,bih,bim,bip,biy,bjp,bke,bkm,bks,bku,bkx,bna,bnp,bnv,bol,bom,bor,bou,bpb,bpc,bpw,bpx,brd,brk,brl,brn,brp,brr,brs,brw,brx,bry,brz,bse,bsl,bsv,bub,bur,bvk,bvm,bvq,bwf,bwl,bwt,bwu,bwy,bxh,bys,bzi,bzl,bzm,cad,cak,cbr,cbu,cch,cdr,cds,cgv,chl,ciu,cle,cln,cly,coh,coi,coy,coz,crb,crr,csb,csf,csz,cte,cwn,cxd,cxg,cxh,cxl,cxw,cxz,cyq,cyu,czi,dap,dbg,dbj,dbu,dbw,dcd,dce,dch,dck,ddb,ddk,ddp,deh,dfd,dfe,dfk,dfm,dfo,dhg,dhl,dih,dit,dja,djv,dkd,dkg,dki,dko,dkr,dks,dkt,dku,dkv,dkw,dky,dlv,dmb,dmf,dmg,dmj,dmk,dmp,dnw,dpa,dpb,dpc,dpf,dph,dpj,dpk,dpn,dpo,dpz,dsh,dsk,dso,dtb,dtc,dtd,dth,dtt,dtu,dtv,dty,dtz,dvs,dwc,dwn,dww,dwx,dwy,dxo,dxs,dxv,dyu,dyv,dza,dze,dze,ebo,ebp,ebu,efk,egb,egb,egk,egv,egx,ehb,ehf,ehh,ehr,ehx,ehz,eim,ejt,eka,eke,eki,eky,ela,ema,emc,eoa,eob,eod,eou,eov,eoy,eqb,eqc,eqg,eqr,eqt,eti,euk,eum,evg,ewi,ewo,ewu,ewy,exl,eza,ezo,ezy,fai,fay,fbg,fbu,fbv,fbw,fdv,fdw,fdz,fei,fem,fey,ffs,fgd,fgk,fgq,fgv,fgz,fhb,fhl,fhy,fii,fjt,fju,fjv,fjx,fke,fkm,fku,fkw,fla,fln,fpb,fpw,fpx,fpz,fqx,fru,frv,frw,fsb,fsd,fsf,fso,fsp,fsv,fvb,fvd,fvm,fvn,fvs,fvv,fvx,fwh,fwt,fwx,fwy,fxc,fxd,fxv,fyu,fyx,fyy,fzj,fzl,fzm,fzv,fzw,gad,gaf,gar,gax,gbu,gcd,gcg,gch,gcx,gdr,gea,ght,gjt,gjv,gjw,gkb,gkr,gmq,gmz,goj,gom,gor,gou,gpc,gpj,gpn,gpo,gpw,gpx,grb,grp,grt,gsb,gsf,gsh,gsu,gtv,gtz,gvb,gvt,gwg,gwh,gwl,gwm,gwt,gwv,gwy,gwz,gxh,gxm,gyy,gzl,gzm,gzr,gzw,hag,hai,haj,haz,hbz,hcd,hch,hda,hdr,hep,hga,hgt,hgv,hij,hik,hiq,hiu,hjt,hka,hke,hki,hku,hlg,hlk,hll,hlt,hlu,hmj,hmp,hna,hnj,hns,hnu,hpb,hpw,hpx,hqr,hsk,hsn,htm,htv,hua,hui,hvi,hvy,hwa,hwd,hwh,hwj,hwl,hwu,hwv,hwz,hxh,hya,hzj,hzl,hzm,hzt,iao,iaw,ibw,idz,ieb,iee,iel,iew,ifl,
iga,igh,igo,igu,iha,ihb,ihc,ihh,ihl,iho,ihp,ihr,ihv,ihw,ihz,iia,iim,iin,iio,iiy,ijb,ijd,ije,ijg,ijh,iji,ijk,ijl,ijm,ijp,ijq,ijs,ijv,ijw,ijx,ijy,ijz,ikf,ikm,ikp,iku,iky,ilb,ilg,ilp,ilr,ima,imf,imo,inc,ior,ipi,iqt,iri,iro,iru,iry,ito,iuc,iud,iue,iui,iuk,iun,ivh,ivm,iwg,iwj,iwl,iwo,iwp,iwt,iwu,iwv,iww,iwy,ixb,ixe,ixz,iyb,iyi,iyo,iyu,jaf,jah,jaj,jap,jbc,jbg,jbj,jbm,jcd,jch,jcp,jcy,jdd,jdg,jds,jel,jfa,jfb,jfe,jfm,jfn,jgd,jgk,jgm,jhp,jhy,jio,jju,jjv,jjz,jke,jkg,jki,jkv,jkw,jlk,jln,jlv,jme,jmf,jmj,jms,jmv,jnj,jnl,jpo,jpw,jpx,jqr,jrm,jrn,jrq,jru,jsf,jsm,jso,jst,jsv,jtj,jtm,jtv,juh,jui,jun,jvb,jvi,jvj,jvm,jvn,jvs,jwh,jwl,jwt,jwv,jww,jxd,jxh,jxm,jyk,jza,jzd,jze,jzg,jzj,jzl,jzm,jzp,kab,kcd,kch,kdg,kdk,kdr,kds,kga,kgd,kgq,kgv,kgw,khq,kib,kig,kih,kjt,kjw,klg,kll,klm,kln,klo,kmj,kmp,kmq,kmu,kmw,kmz,kna,koi,koo,kou,kpb,kpl,kpm,kpw,kqr,kqx,ksb,ksd,ksf,ksg,ksh,kuh,kuu,kvk,kvl,kvm,kvx,kwl,kws,kwv,kwy,kxf,kxg,kxl,kxm,kyg,kyy,kzf,kzg,kzl,kzm,kzr,kzs,kzt,kzw,lap,lbi,lbj,lbk,lbo,lbp,lbq,lbu,lbv,lbw,lcf,lcm,lcn,lcp,lcv,ldu,ldv,lek,lfh,lgu,lgv,lhb,lhc,lhh,lhy,lia,ljw,lkb,lkf,lkg,lkr,llq,lmb,lml,lmx,lmy,lmz,lnf,lnh,lnp,lnq,lnv,lnw,loi,lox,lpb,lpc,lpm,lpn,lpt,lpw,lpx,lqf,lql,lqx,lrl,lru,lsc,lsf,lte,ltr,ltv,lus,lux,luz,lvm,lvu,lwb,lwh,lwl,lwn,lwq,lwu,lwy,lwz,lxb,lxh,lym,lyn,lyr,lzg,lzj,lzl,lzt,lzy,lzz,mbb,mbg,mbo,mdf,mdg,mdh,mdl,mdo,mdr,mdv,mdw,met,mey,mha,mhb,mhf,mhg,mhj,mhk,mhp,mhv,mhx,mhy,mii,mio,mjb,mjm,mjy,mkb,mkg,mkl,mkx,mlx,mmo,mmr,moc,mpa,mpf,mph,mpj,mpk,mpl,mpn,mpq,mpr,mpt,mpu,mpv,mpw,mpz,mru,msh,msy,mtd,muh,mui,mul,mup,mur,muu,muy,mwb,mwf,mwi,mwn,mwq,mwt,mwz,mxj,mxu,myr,myu,mzg,nan,nar,nax,nbg,nbi,ncb,ncc,ncd,nch,ncl,ncp,ncv,ncx,nda,ndr,ndx,nev,nfd,ngv,nhd,nhk,nhp,nhu,nhv,nhz,nia,nie,nii,nin,nir,nis,niu,nke,nkf,nki,nkk,nko,nku,nkv,nkw,nlo,nlv,nmb,nmp,nmu,nmz,nna,nox,npb,npn,npw,npx,nqr,nqx,nrf,nrg,nrh,nsb,nsc,ntd,nto,nts,ntu,ntv,ntz,nvg,nvi,nvj,nvk,nvl,nvo,nvu,nvv,nwm,nwn,nwy,nyi,nyj,nyk,nyp,nyr,nyy,nzb,nzo,oaa,oat,oav,oax,obo,obu,oca,ode,odh,odk,odl,odp,odr,oee,oel,oey,ofa,ofp,oge,
ogf,ogk,ogl,ogr,ogv,oha,ohg,ohv,ohw,oia,oib,oih,oii,oim,oip,oir,oix,ojb,ojn,ojt,ojv,ojw,oka,okf,olu,ome,omi,omo,omu,omv,onz,ooh,oou,oov,opb,opg,opk,opl,opq,opv,opw,orh,ori,ose,ota,ott,otv,oue,owb,owf,owg,owh,owi,owz,oxz,oya,oyj,oym,oyq,oyu,oyv,oyy,oyz,oza,ozo,paf,pag,par,pbc,pbi,pbv,pbw,pcd,pdg,pdk,pdl,pdn,pdo,pgm,pgw,pha,phy,pjy,pkm,pku,pln,pme,pmj,pmk,pml,pmp,pnf,pon,poo,ppp,pri,psf,psm,psp,ptj,pvo,pvs,pwh,pwl,pwn,pwt,pwy,pxg,pxh,pxo,pyb,pyh,pyq,pyy,pyz,pzl,pzm,pzp,pzw,qag,qak,qar,qaw,qaz,qbg,qbu,qbv,qbw,qbx,qcd,qch,qci,qcq,qcy,qcz,qdc,qdg,qdk,qdl,qdr,qgk,qgm,qgq,qgs,qgv,qgy,qgz,qha,qhb,qhh,qhp,qhr,qhy,qhz,qia,qji,qjl,qjt,qju,qjv,qjw,qjx,qjy,qkp,qkr,qlh,qlj,qlt,qmj,qml,qmp,qmu,qnj,qob,qoc,qoe,qoy,qoz,qpb,qpp,qpw,qpx,qqf,qqr,qqt,qqv,qqx,qri,qrj,qrk,qrp,qru,qsf,qsv,qtj,qvj,qvn,qwg,qwh,qwl,qwn,qwt,qwu,qwv,qwy,qxh,qxm,qxq,qxs,qym,qyp,qyt,qzl,qzm,qzo,qzt,rak,rbh,rbi,rbj,rbm,rbq,rbv,rbw,rbx,rbz,rcd,rcf,rch,rea,rek,rga,rgx,rha,rhc,rhh,rhn,rhv,riu,rjg,rjt,rjy,rjz,rkf,rkg,rkq,rkx,rmv,roc,rou,rov,rpw,rpy,rpz,rra,rrg,rsd,rsl,rsn,rss,rui,rup,rwy,rxb,rxg,rxh,rxj,rxk,rxw,rxz,ryb,ryn,rys,rza,rzj,rzl,rzm,sae,saz,sbk,sbl,sdr,seb,seu,sfv,sfw,sfy,sgh,sgy,sha,shq,sht,shu,sid,sij,siu,sjb,sjc,sjd,sje,sjf,sjg,sjh,sji,sjj,sjk,sjl,sjm,sjp,sjq,sjs,sjt,sju,sjv,sjw,sjx,sjy,sjz,skd,sko,sku,slu,sme,smf,smg,smh,smj,smk,smk,sml,smn,smp,smr,smv,smz,sog,soh,soi,soj,sot,sou,spe,sph,srf,srg,srn,srr,srs,sru,srv,srx,ssp,stj,stk,sud,swf,swg,swl,sww,syn,syu,szo,szw,taj,tal,tat,tcg,tcj,tcl,tcv,tdc,tdj,tdr,tds,tdz,tet,tfv,tgt,thn,tir,tiv,tjt,tju,tke,tkw,tle,tlv,tmu,tnb,tnj,tnl,tnn,tnp,tnr,tnu,tnv,tnw,tpa,tpb,tpc,tpn,tpw,tpx,tqx,tsu,ttp,tvg,tvu,tvx,twb,twi,twu,tww,tyb,tyr,tyv,uaa,uab,uag,uai,uan,uao,uap,uar,uaw,uaz,ube,ubg,ubj,ubk,ubv,ubw,uca,uch,uco,ucs,udr,uds,uea,uee,uef,ufa,ufb,ufd,ufg,ufo,ufs,ufv,ufw,ufy,uga,ugu,ugy,uha,uhf,uhi,uhl,uhp,uhr,uhy,uie,uim,uio,uip,uiw,uix,ujf,ujg,uji,ujj,ujn,ujs,ujt,uju,ujw,ujx,ujy,uks,ula,ulb,ulc,ulg,ulh,ulj,ulk,ulm,uln,ulp,ulq,ulr,uls,ulv,ulw,ulx,ulz,umb,ume,umf,umh,umj,
umk,uml,umn,umo,ump,umr,ums,umv,umx,umy,uno,uob,uoe,uog,uop,uou,uov,uoy,uoz,upl,uqa,uqb,uqc,uqi,uqt,uqx,ura,urd,uru,usu,utu,uua,uub,uuc,uud,uue,uuf,uuh,uui,uuk,uum,uun,uuq,uva,uvb,uvd,uvi,uvm,uvo,uvs,uvt,uvy,uwe,uwh,uwl,uwo,uwp,uws,uws,uwt,uwy,uxd,uxi,uyd,uye,uyf,uym,uyz,uzo,uzu,uzv,uzw,vag,var,vbb,vbg,vbi,vbm,vbu,vcd,vch,vcv,vdc,vdl,vdq,vdr,ven,vew,vex,vey,vfc,vfe,vgf,vgo,vgv,vgx,vgy,vhb,vhr,vhv,vhy,vim,viv,vix,vjb,vjt,vke,vlh,vlj,vln,vlv,vmj,vml,vmz,vna,voc,vpb,vph,vpw,vrj,vrn,vrv,vsh,vso,vtd,vtv,vud,vui,vuj,vuk,vum,vun,vvi,vwh,vwl,vxh,vxi,vxv,vyv,vzl,vzm,vzo,vzp,wau,wav,wba,wbi,wbs,wbu,wcd,wcy,wda,wdr,wew,wfa,wfg,wfj,wfk,wfm,wfw,wfy,wgf,wgg,wgi,wgm,wgs,wgv,wha,wiz,wjd,wjt,wju,wjv,wke,wko,wkx,wli,wlk,wlt,wlv,wlz,wma,wmb,wmf,wmg,wmj,wmk,wmp,wms,wmv,wmw,wnp,wnv,wnw,woi,woj,wou,wov,woy,wpa,wpb,wpc,wpf,wpl,wpo,wpp,wps,wpt,wpw,wpx,wqb,wqr,wrl,wry,wsb,wsf,wsn,wsp,wsv,wtd,wti,wtr,wtv,wtw,wuk,wun,wva,wvb,wve,wvi,wvy,wwd,wwn,wwv,wwy,wxg,wxh,wxi,wxj,wxr,wxu,wxw,wxz,wyy,wzg,wzi,wzm,wzp,wzp,wzq,wzu,wzw,wzx,xag,xar,xaz,xbi,xbm,xbu,xbw,xcd,xch,xcq,xdr,xds,xfy,xgm,xgq,xhb,xhh,xie,xih,xii,xij,xik,xio,xiq,xiu,xiv,xjw,xkg,xkk,xkl,xkq,xku,xlh,xlv,xlw,xlx,xmj,xob,xoi,xoy,xpb,xph,xpk,xpn,xpw,xqx,xrj,xrl,xru,xrw,xsb,xsd,xsh,xsl,xtd,xtj,xuc,xui,xuo,xvj,xwg,xwj,xwp,xwt,xwu,xwv,xwy,xxb,xxi,xyu,xyy,xzb,xzf,xzi,xzj,xzl,xzo,xzp,xzv,yaa,yag,yai,yam,ych,ydc,yeb,yej,yeq,yga,ygn,ygq,ygr,ygv,ygz,yha,yhb,yhy,yia,yib,yik,yiu,yiy,yjt,yjw,yjy,yku,yle,ylv,ymp,yne,ynn,ynr,yoa,yob,yoh,yok,ypn,ypw,ypx,ypy,yqx,yrl,yrm,yru,ysu,yte,ytj,ytm,ytq,ytr,ytv,yuh,yui,yuj,yvj,yvm,yvn,ywh,ywt,yxh,yyd,yyg,yyi,yyp,yyr,yyu,yyz,yza,yzy,zab,zac,zay,zbg,zbi,zbj,zbw,zcd,zcx,zdg,zdm,zds,zeb,zei,zeu,zfz,zgd,zgg,zgm,zgq,zgv,zhp,zhr,zhy,zib,zix,ziy,ziz,zjt,zju,zjw,zke,zkf,zky,zlv,zmb,zmj,zmt,zmv,zna,znw,znz,zou,zpa,zpx,zqx,zri,zrj,zro,zrp,zru,zrw,zsb,zsh,ztv,zue,zun,zvj,zvm,zvn,zvu,zwt,zwv,zxv,zzg,zzj,zzk,zzp,zzr
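The pasted list contains a few repeated entries (for example, `dze` and `egb` each appear twice in a row), so it is worth normalizing it before counting or sampling. A small sketch, assuming the list has been saved as plain comma-separated text; `parse_tokens` and `dedupe` are hypothetical helper names.

```python
def parse_tokens(raw):
    """Split comma-separated text into tokens, dropping blanks and surrounding whitespace."""
    return [t.strip() for t in raw.replace("\n", ",").split(",") if t.strip()]


def dedupe(tokens):
    """Return (unique tokens in first-seen order, sorted list of tokens seen more than once)."""
    seen = set()
    unique = []
    repeated = set()
    for token in tokens:
        if token in seen:
            repeated.add(token)
        else:
            seen.add(token)
            unique.append(token)
    return unique, sorted(repeated)


if __name__ == "__main__":
    # A short excerpt of the list above, including a line break and two repeats.
    sample = "dza,dze,dze,ebo,\negb,egb,egk"
    unique, repeated = dedupe(parse_tokens(sample))
    print(len(unique), repeated)
```

Running the same two functions over the full text gives the real number of distinct tokens, which is the figure that matters when planning how many of them to analyze.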

But before we continue, a little background. In order to improve the level of realism of images (through “promptHero”), I analyzed several hundred prompts and found a rather strange expression containing an array of acronyms, the meaning of which was at first a mystery to me:

(gpo;aca;aer;api;fla;gcx;hll;hnj;gpc;fii;fey;fbv;evg;iew;ifl;igh;iwj;iwp;ixb;ixe;ixz;jaf;jbm;jfb;jsf ;jyk;kmz;ksh;kxg;kzg;lbv;zac;yle;zmj;szw;uiw;vfe;par;pdl;qdl;mbo;mtd;gor;bhz;dit;frw;fnaf;bmo;zbi:0.9 )

At first, I mistakenly decided that this was a list of abbreviations and acronyms of technical terms and, succumbing to “cognitive distortion”, I even managed to “decipher” some of them:

"..,

aer: artist error,

fla: flat image,

pdl: poor detail,

bmo: bad model output,

zbi: z-buffer issue,

..,"

In fact, these abbreviations are associated with anime styles, and excluding the right selection of them (in a negative prompt) makes it possible to generate photorealistic images. Here is a clear example of the styles indicated by tokens: https://files.catbox.moe/yhg6tr.jpg

Since I don’t have the free time to analyze 1970 tokens, for now we will rely on a selection (for a negative prompt) by an unknown author. Without this expression, the result of "Everclear PNY by Zovya V3.0"🦄 still has attributes that distinguish the resulting image from a photo:

However, we just need to add a third line to the negative prompt and "miracle", the level of photorealism of our images increases significantly!

/1/(score_6;score_5;score_4;worst quality;low quality;normal quality:1.5),(different from awarded photo;without optical_physics/light_effects/DoF:1.8),(paintings;source_furry;source_pony;pony),
/2/(artificial;semirealtic;surrealtic;illustration;painting;cgi;3d;2d;render;game;sketch;cartoon;drawing;retouch;contours;anime:1.6),(lowres;monochrome;grayscale),(jpeg artifacts;blurry;low resolution;poorly drawn:1.5),(out of view;out of frame;close up;cropped:1.5),(aliasing:1.5), 
/3/(gpo;aca;aer;api;fla;gcx;hll;hnj;gpc;fii;fey;fbv;evg;iew;ifl;igh;iwj;iwp;ixb;ixe;ixz;jaf;jbm;jfb;jsf;jyk;kmz;ksh;kxg;kzg;lbv;zac;yle;zmj;szw;uiw;vfe;par;pdl;qdl;mbo;mtd;gor;bhz;dit;frw;fnaf;bmo;zbi:0.9),(text, words),
/4/(blurred/low_detailed/poorly_drawn/not_real or bad/ugly/amputated/mutated/deformed or extra/less body_parts/hands;legs;fingers;nails;eyes;teeth:1.5),(low detailed at distance), (minor,medium,small,tiny breasts sizes,dark skinned woman:1.5),(EasyNegative),old, fat,

The results are here: https://civitai.com/posts/2824209

And now the trash content! ))) I took 970 characters from the end of the token array (that’s about 242 tokens, chosen with no regard to their meaning) and completely replaced the contents of the negative prompt with this “Abra-Kadabra”, and what do you think?

To my surprise, the result turned out to be relatively “photorealistic”, where the women are more reminiscent of the works of monumentalism artists like A. Deineka or E. Abbey, the rough, masculine bodies of women:

(wve,wvi,wvy,wwd,wwn,wwv,wwy,wxg,wxh,wxi,wxj,wxr,wxu,wxw,wxz,wyy,wzg,wzi,wzm,wzp,wzp,wzq,wzu,wzw,wzx,xag,xar,xaz,xbi,xbm,xbu,xbw,xcd,xch,xcq,xdr,xds,xfy,xgm,xgq,xhb,xhh,xie,xih,xii,xij,xik,xio,xiq,xiu,xiv,xjw,xkg,xkk,xkl,xkq,xku,xlh,xlv,xlw,xlx,xmj,xob,xoi,xoy,xpb,xph,xpk,xpn,xpw,xqx,xrj,xrl,xru,xrw,xsb,xsd,xsh,xsl,xtd,xtj,xuc,xui,xuo,xvj,xwg,xwj,xwp,xwt,xwu,xwv,xwy,xxb,xxi,xyu,xyy,xzb,xzf,xzi,xzj,xzl,xzo,xzp,xzv,yaa,yag,yai,yam,ych,ydc,yeb,yej,yeq,yga,ygn,ygq,ygr,ygv,ygz,yha,yhb,yhy,yia,yib,yik,yiu,yiy,yjt,yjw,yjy,yku,yle,ylv,ymp,yne,ynn,ynr,yoa,yob,yoh,yok,ypn,ypw,ypx,ypy,yqx,yrl,yrm,yru,ysu,yte,ytj,ytm,ytq,ytr,ytv,yuh,yui,yuj,yvj,yvm,yvn,ywh,ywt,yxh,yyd,yyg,yyi,yyp,yyr,yyu,yyz,yza,yzy,zab,zac,zay,zbg,zbi,zbj,zbw,zcd,zcx,zdg,zdm,zds,zeb,zei,zeu,zfz,zgd,zgg,zgm,zgq,zgv,zhp,zhr,zhy,zib,zix,ziy,ziz,zjt,zju,zjw,zke,zkf,zky,zlv,zmb,zmj,zmt,zmv,zna,znw,znz,zou,zpa,zpx,zqx,zri,zrj,zro,zrp,zru,zrw,zsb,zsh,ztv,zue,zun,zvj,zvm,zvn,zvu,zwt,zwv,zxv,zzg,zzj,zzk,zzp,zzr:1.5)

Conclusion: further research work is needed to create the best negative prompt expression from the Pony token array! :)
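One way to approach this research is to test reproducible random subsets of the token array instead of hand-picked ones. The sketch below assumes the tokens are already available as a Python list and formats a subset in the same `(a;b;c:0.9)` style used in line /3/ of the negative prompt above. The function name, subset size and seed are illustrative assumptions, not part of any existing tool.

```python
import random


def negative_group(tokens, size, seed, weight=0.9, sep=";"):
    """Pick `size` distinct tokens reproducibly and wrap them in one weighted group."""
    pool = list(dict.fromkeys(tokens))   # remove repeats while keeping order
    if size > len(pool):
        raise ValueError("size exceeds the number of distinct tokens")
    rng = random.Random(seed)            # a fixed seed makes the subset repeatable
    chosen = sorted(rng.sample(pool, size))
    return f"({sep.join(chosen)}:{weight})"


if __name__ == "__main__":
    tokens = ["aca", "aer", "api", "fla", "gcx", "hll"]
    for seed in range(3):
        print(seed, negative_group(tokens, 3, seed))
```

Recording the seed next to each generated image makes it possible to return to a subset that worked well and then narrow it down token by token.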

<br/>

<br/>

5.4. The difference in the interpretation of SD corrections in unsystematic prompts in comparison with the LSD methodology (Layer Structural Defragmentation)

To begin with, we should analyze the unsystematic organization of a prompt expression. We are interested in the question of why adjustments that seem intuitively simple sometimes give unpredictable results, and why this does not happen in the case of the LSD methodology.

Let's go back to the Spiderman example. So, the evil boss was pleased with Peter's work, but being a villain, he decided to continue torturing our arachno-primate. Therefore, J. Jonah Jameson, referring to the points in the technical specifications with adjustments, demanded that Parker change the shape of the breasts in accordance with common forms "Y" with the obligatory condition that breast size "X" remains unchanged! Accordingly, when modifying the variable "Y", changing the variable "X" is unacceptable.

Will Peter Parker be able to make adjustments along the way without disturbing the main details of the artistic composition? Let's find out! :)

So, here is my old prompt without LSD methodology, which does not have a logically ordered structure, like ~90% of prompts on "Civitai". Let's hope everything will change from now on! :)

Unsystematized prompt:

"(absurdly busty 1girl,close up,skinny face|prefect pretty freckles face, (heavy breathing, absurdly gigantic breasts:1.5), extremely busty smiling Elsa of Arendelle, nude nsfw extremely gigantic breasts, cinematic lights:1.4), (highly detailed wet skin:1.5), (close-up hyper realistic photography:1.4), (16k pov photo of nude absurdly busty girl:1.5), (a photorealistic photography:1.4), (hyperrealistic photography:1.5), (a nude absurdly busty freckles woman extremely sexy posing, dynamic pose:1.4), digital art, by Randy Vargas, Artstation, fantasy art, long blond braided hair, nsfw extremely gigantic breasts, photo of Elsa of Arendelle, (really large bust, extremely busty:1.9), wow, (nude extremely protruding nipples:1.5), (perfect detailed hands:1.9) (waist level portrait:1.8). realistic, photographic, hyper-detailed, <lora:LowRA:0.6> (LowRA dramatic side lighting:1.5), <lora:epiNoiseoffset_v2:1> <lora:bigguns_v121:0.7>, (detailed stunning eyes)++, blue eyes, perfect face, (puffy pink nipple:1.4), (very large pink areola:1.4), (very curvy, extremely busty body:1.5), (extremely gigantic breasts:1.5), (naked wide absurdly gigantic breasts:1.5), (bright glare of environment clearly reflected from all shaded surfaces of woman's wet skin:1.8), (highly detailed blue extremely wide open eyes:1.4), (focus from below, digital art by Artgerm and Greg Rutkowski and Alphonse Mucha:1.4), dynamic posture, unreal engine, trending on ArtStation, cinestill 800 tungsten,"

Above, we see a prompt expression that contains an unsystematic conglomerate of duplicated variables, categorically similar variables and complex expressions with contradictory properties. As a result, such descriptions produce unstable results even with the slightest change in the prompt expression.
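Duplication of this kind can be detected mechanically. The sketch below is a rough heuristic, assuming the common `(text:weight)` emphasis syntax: it strips tags, brackets and weights, splits on commas and `|`, and counts fragments that occur more than once. It does not detect terms that are merely similar in meaning, and the sample prompt in the usage lines is a made-up neutral one.

```python
import re
from collections import Counter


def split_terms(prompt):
    """Reduce a prompt to lowercase fragments without tags, weights or brackets."""
    no_tags = re.sub(r"<[^>]*>", " ", prompt)                  # drop <lora:...> tags
    no_weights = re.sub(r":\s*\d+(?:\.\d+)?", "", no_tags)     # drop ':1.5' style weights
    flat = re.sub(r"[()\[\]+]", "", no_weights)                # drop emphasis brackets and '+'
    return [" ".join(t.lower().split()) for t in re.split(r"[,|]", flat) if t.strip()]


def duplicated_terms(prompt):
    """Return {term: count} for every fragment that appears more than once."""
    counts = Counter(split_terms(prompt))
    return {term: n for term, n in counts.items() if n > 1}


if __name__ == "__main__":
    sample = ("(blue eyes, cinematic lights:1.4), (wet skin:1.5), blue eyes, "
              "(Blue Eyes)++, <lora:LowRA:0.6> wet skin")
    print(duplicated_terms(sample))
```

Every term reported here is a candidate for merging into a single grouped expression.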

The problem is caused by the fact that any prompt (for example, “big tits”) is essentially an array of associated datasets, whose values can be overlapped, ignored, mixed or amplified in a way that is unpredictable for the user without the use of LSDM.

Thus, it becomes obvious to us that complex prompt expressions should be written in such a way that the datasets describe the general underlying concept without leading to conflicts.

Below is a prompt expression within the framework of the LSD methodology, where the properties of the “breast” are gathered into one expression, thereby creating a common array of associations. Its effectiveness is verified by the experiments given.

(gigantic breasts;huge breasts;enormous lactating tits;The best enormous breasts and breast shape is hemispherical:1.4)

Now, we can once again verify the effectiveness of the proposed approach by performing the “evil boss” adjustments, where to achieve the result we just need to change the value of the “breast shape is …” variable:

hemispherical

Pendulous

saggy

shallow

slender
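The same single-variable adjustment can be sketched as a template with one slot. To keep the illustration generic, the template below is a hypothetical one about hair rather than the expression above. The point is only that every variant differs from the others in exactly one value, while the rest of the group and its weight stay fixed.

```python
from string import Template

# Hypothetical grouped expression with a single adjustable variable ($style).
HAIR_GROUP = Template("(long hair;blonde hair;The best hair style is $style:1.4)")


def variants(template, values):
    """Return one rendered expression per value of the single variable."""
    return [template.substitute(style=v) for v in values]


if __name__ == "__main__":
    for expression in variants(HAIR_GROUP, ["braided", "wavy", "straight"]):
        print(expression)
```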

Conclusion: If the model has not been trained on a specific category of abstractions, then it is physically impossible to achieve an arbitrary result, however, additional LORA models can change the situation: https://stable-diffusion-art.com/lora/

Result here: https://civitai.com/user/boobinator/posts

<br/>

<br/>

5.5. Conclusions and summary of the results obtained

Friends, we can congratulate ourselves, we have just completed the technical task, confirming the effectiveness of the proposed methodology. Perhaps J. Jonah Jameson could praise Parker. :)

However, let's check how valid this interpretation is! :) What exactly is the empirical evidence in favor of the conclusion? Let's state the facts:

✔ Achieving the desired result in fewer iterations

✔ A method for determining the correlates of a particular abstraction.

✔ Create logical subsets to improve implementation efficiency.

✔ Achieve the desired result by changing the minimum number of variables.

Of course, sometimes the proposed approach may require additional analytical work, but in most cases, it is enough to add emphasis to new variables or change their order in case of conflicting meanings. :)

<br/>

<br/>

6. Prosocial motivation as the basis for the development of civilization and progress

According to my observations, some users (a minority) post images without prompts, apparently believing that the prompts have some kind of “value”. As we saw above, the value lies only in the ability to write prompt expressions that meet detailed technical specifications, not in the prompts themselves. How justified is such an approach? Let's figure it out. :)

Let me give you an example. I’m a programmer, and we have huge communities where we don’t keep technical solutions secret but happily share them with each other. This is our strength; see “Stack Overflow”, “CodePen”, “GitHub”. The reason is simple: together, we achieve many times more than if we were social atoms without like-minded people, common goals and values. :)

People with “secret phobias” should pay attention to sociology. Within any field of activity, to name a few: “software engineering, mechanical engineering, game development, science, medicine, cosmetology, art, crafts” - there are three main factors: professional development “D”, narrow specialization “L” and local population "P".

Now, if we apply a Gaussian distribution to the variable "D" in the population "P", we will get a distribution of skills (but which ones, between what exactly?). With “L” and “D” the situation will remain sad (in relation to Stable Diffusion) until we, as a community, create enough documentation and textbooks. Until their volume is large enough that mastering the specialty takes 4 years, I’m afraid it will be impossible to talk about any “professionalism” in prompt engineering. How to change this state of affairs?

Look, in relation to programming, any person involved in this activity has a roadmap, documentation and textbooks to go from junior to senior: https://roadmap.sh/ The reason is that there are many in-demand specialties that can provide a useful result. Here are the clear benefits of prosocial behavior! :)

In the case of "prompt engineering", we can say that at the moment the community mainly consists of amateurs who can perform some specific tasks, but few are high-class professionals. For comparison, we currently have such a “thoughtful” roadmap for prompt engineering (so thoughtful that it can be studied in 1 week): https://roadmap.sh/prompt-engineering The time has come to change this: share your experience, create textbooks and roadmaps, so that “prompt engineering” becomes a much-needed specialty and a specialist in this field has the opportunity to scale and grow a career through simple diligence. This is how everything works everywhere! :)

<br/>

<br/>

7. How did I come up with this methodology, will there be a continuation?

First of all, I have never used LSD (as a chemical), the acronym in the title has humorous connotations, so I further suggest using the full name of the technique “LSDM” (Layer Structural Defragmentation Methodology). :)

The most important factor that allowed me to write this article was a specific “cognitive complexity” (we are not born with it, but we build it up). My current knowledge includes 3D packages (Blender, 3ds Max, Maya) and WebGL frameworks (three.js, babylon.js) + I am a full-stack developer and data scientist. Also, I will be a member of the "society of skeptics". Everything described above gave me the necessary predictors to create this methodology. Therefore, go ahead, study, be many times more competent than me! :)

But the most important factor is scientific methodology, without which it is impossible for a person to succeed in any intellectual activity. You should remember that only those hypotheses that have been tested by practice can be considered theories with predictive power.

As for continuing to create educational materials, I plan to write several more articles. But due to busyness, this will not happen soon (I’m studying intensively now + I’m stuck at work). I'll probably continue during the summer holidays. :)

<br/>

<br/>

8. Wishes to the future prompt engineer

My dear padawans, I hope that I was able to present the material as meaningfully as possible, simply and humorously revealing the operating principles of the LSD methodology and providing all the necessary empirical evidence of the effectiveness of the approach. :)

Now you can consider yourself as LSD Jedi, but I ask you not to go over to the dark side of the force! Never use the harmful chemical LSD, and associate this acronym only with Layer Structural Defragmentation! :)

May the boobs ( )( ) be with you! :)
