Type | |
Stats | 649 |
Reviews | (46) |
Published | Aug 9, 2024 |
Base Model | |
Usage Tips | Clip Skip: 1 |
Trigger Words | RAW photo of, Analog photo of, Photo of |
Hash | AutoV2 786CCFEE2F |
Before use
You need to know how Stable Diffusion works. I recommend using Automatic1111 as the interface to run the model.
This model is based on SD 1.5, so keep in mind that it is not perfect. I had to run a lot of tests before arriving at stable generation. I will improve the model when a better base model arrives (such as a new SD XL).
This is a checkpoint model.
It is a merge, so it can sometimes generate NSFW images. To avoid this, just add "nudes" to the negative prompt.
I recommend following me on Instagram, where I explain AI image generation: https://www.instagram.com/eddiemauro.design/
Intro
FRESH PHOTO (realism eddiemauro-mix) CHECKPOINT: Hi, I’m a product and car designer, and I’m excited to experiment with AI. I think it is a good tool for design.
v1: The product of combining realistic models; it performs well across different types of photography. My idea is to create a "general model" for realistic image creation.
v1.5: Better prompting and more consistency. More vivid colors. Better ethnic diversity. Better details.
v2.0: More detail, more realism, more resolution. Better consistency with shapes. Less of a warm filter. Better representation of ethnic diversity. Generally better than 1.5, although for some specific subjects 1.5 is still better.
v2.5 (+inpainting): Better than 2 overall.
v2.5 LCM: Better than 2 overall. Darker and more detailed than the normal 2.5. You can use it with Euler-a Normal or LCM Normal, with CFG 1-2 and Steps 5-20. Note: the example images here were created with CFG 1.5; the Civitai metadata shown for them is not correct.
If you want to support my work and help me upload more models (with better quality), you can donate here; I would greatly appreciate it: https://ko-fi.com/eddiemauro
Installation
I use Automatic1111, which I consider the best UI for Stable Diffusion image generation, so I recommend installing it locally or using it online through Colab or another hosting service. You can find instructions and videos online. If you install it locally, you can watch this tutorial online. I recommend an NVIDIA graphics card with at least 6-8 GB of VRAM for a stable interface, and opening the UI in “Microsoft Edge”, because you will have problems in “Google Chrome”. Also try enabling the “medvram” or “lowvram” options along with “xformers” (search online for how to do it).
You have to install this checkpoint model to use it.
For image creation, please follow all my recommendations; if you don't, you will not get good image quality. Also keep in mind that, as of today, AI image generation is not fully consistent, so you have to invest time and run plenty of tests.
Recommendations for image generation
Activation token/caption: Put "RAW photo of" or "Analog photo of" at the start of the prompt to add realism to the image. This is not mandatory.
Prompting recommendation: These words in the prompt will enhance image generation. In the positive space: “Photorealistic, Hyperrealistic, Hyperdetailed, detailed skin, soft lighting, subsurface scattering, realistic, masterpiece, best quality, ultra realistic, 8k, Intricate, High Detail, film photography, soft focus”. In the negative space: “((nsfw)), ((asian)), Japanese, Korean, Chinese, ((disfigured)), ((deformed)), ((extra limbs)), (((duplicate))), ((morbid)), ((mutilated)), out of frame, extra fingers, mutated hands, poorly drawn eyes, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), cloned face, body out of frame, out of frame, bad anatomy, gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), (fused fingers), (too many fingers), (((long neck))), tiling, mutated, cross-eye, canvas frame, frame, cartoon, 3d, weird colors, blurry, cgi, 3d, render, sketch, cartoon, drawing, anime, cropped”. You can also look at the metadata of the example images here and reproduce their prompts. To capture a specific ethnic group, I recommend the name generation method: select the ethnic group on this webpage and copy just a first and last name into the prompt.
Prompting recommendation (v2.5): Keep it simple. In the positive space, just put "RAW photo of", "Photo of", or "Analog photo of". In the negative space: "nudes, asian, worst quality, normal quality, bad quallity, text, artifacts, bad eyes, strabismus, deformed, cartoon, render". The model mostly generates Asian-looking people; if you want to avoid this, put "asian" in the negative prompt.
Textual inversion/embedding or Lora tool recommended: I consider “EasyNegative” one of the best textual inversions for the negative prompt space, so you should use it. Download it here and install it by putting the file inside the “embeddings” folder. You can also use “Detail Tweaker” to add more detail to the image: download it from here, install it as a Lora, and use it in the positive prompt with a value of “1”. Use it when the image is rich in detail, but not when it is minimalist. You can also apply "Detail Tweaker" only in img2img mode, after generating a batch. You can use other Loras too, such as "Epi noiseoffset", which increases contrast.
Textual inversion/embedding or Lora tool recommended (v2.5): Keep it simple; embeddings and complex words are not necessary. In the positive prompt you can use the “Detail Tweaker” Lora, and nothing else, if you find that the image needs more detail.
VAE: In most cases I recommend the standard Stable Diffusion VAE, “vae-ft-mse-840000-ema-pruned”. In v1.5 and v2 the VAE is baked in. The photo style has a subtle hint of warmth (yellow) and a soft touch of color desaturation. Version 1.5 tends to be more colorful because of the baked-in VAE.
Clip Skip: Use 1 for more realism. Use 2 only for experimenting.
Steps and CFG: I recommend Steps of “30-50” and a CFG scale of “6-8”; the ideal is steps 30, CFG 7. These values could change for future models. Sometimes I find that CFG Scale 9 with Sample Steps 40-50 works well. Above CFG 10 with high sampling steps, generations start to fail. For v2.5, use 30 steps and CFG 6. LCM: You can use it with Euler-a Normal or LCM Normal, with CFG 1-2 and Steps 5-20.
Sampler: I mostly use “DPM++SDE Karras”. Euler tends to be simpler, but with less detail. Experiment with other samplers if you like.
Batch: In txt2img, try a value of 4 so that you generate more than one image and can compare the results. If you have a good graphics card, you can use “Batch size”, which creates the 4 images at the same time and needs more VRAM. If your computer cannot handle this, switch to “Batch count”, which creates the 4 images one after another (not at the same time), although the total generation time will be longer.
Image aspect: Try these dimensions: 512x512, 768x512, 512x768, although you can experiment with others. Don't generate larger images directly, because the style could be lost. If you want a bigger image, use hires.fix in txt2img mode, the img2img upscaling method, the Ultimate SD Upscale script extension + ControlNet, or plain upscaling with GAN models.
Create bigger images: There are 4 different methods to create large images in Stable Diffusion; you can look up how to use them online. For the first method, “txt2img hires.fix”, I recommend the upscale model “4x-UltraSharp”: download just the “.pth” file here and install it by putting it inside the “ESRGAN” folder. In the hires.fix options, set any “upscale by” value and a “denoise strength” of “0.5-0.7”. For the second method, first select the image generated in txt2img and send it to img2img mode, increasing the dimensions by at least “1.5 times” with a “denoise strength” of “0.3-0.5”. For the third method, use the same img2img configuration, but activate the “tile” mode of the “ControlNet” extension together with the “Ultimate SD Upscale” script; for that, I recommend watching a tutorial here. For the last method, send the image generated in txt2img to “extras”, then select a GAN model and scale it; you can use “4x-UltraSharp”, "4xNMKD-SIAX_200k" or "4xUniscaleV2-Moderate". For 2.5, use the “8x-NMKD-Superscale_150000_G” model.
Get more control of your creation: Use the “ControlNet” extension to control the shape of what you generate; you can even drive it with sketches. Use the “Scribble” or “Lineart” modes. For that, I recommend installing this extension and then learning how to use it. There are plenty of online videos about it.
Copy prompt from image metadata: You can download my example images here and drop them into the “PNG info” tab of Automatic1111.
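The step, CFG and img2img size recommendations above can be kept in one place and checked before a run. The following is a minimal Python sketch, not part of the model or of Automatic1111. The names `RECOMMENDED`, `check_settings` and `img2img_target_size` are hypothetical, and rounding the size down to a multiple of 8 is an assumption based on how SD 1.5 models handle image dimensions.

```python
# Recommended ranges copied from the "Steps and CFG" recommendation.
# Each entry is (min, max); the key names are made up for this sketch.
RECOMMENDED = {
    "general": {"steps": (30, 50), "cfg": (6, 8)},
    "v2.5": {"steps": (30, 30), "cfg": (6, 6)},
    "v2.5-lcm": {"steps": (5, 20), "cfg": (1, 2)},
}


def check_settings(version, steps, cfg):
    """Return a list of warnings for values outside the recommended range."""
    ranges = RECOMMENDED[version]
    warnings = []
    for name, value in (("steps", steps), ("cfg", cfg)):
        low, high = ranges[name]
        if not low <= value <= high:
            warnings.append(f"{name}={value} is outside {low}-{high} for {version}")
    return warnings


def img2img_target_size(width, height, factor=1.5, multiple=8):
    """Scale a txt2img size for the img2img method, rounded down to a multiple of 8."""
    def scale(side):
        return int(side * factor) // multiple * multiple
    return scale(width), scale(height)


print(check_settings("general", 30, 7))
print(check_settings("v2.5-lcm", 30, 7))
print(img2img_target_size(512, 768))
```

An empty list from `check_settings` means the values fall inside the range recommended for that version. The default `factor` of 1.5 is the minimum increase suggested for the img2img method.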
Example Prompting:
Positive prompt:
RAW photo of Gotzon Otxoa in casual clothes, little smile, small details, photorealistic, ultra-realistic photo, 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3, (masterpiece) <lora:add_detail:1>
Positive prompt v1.5-2:
RAW photo of Muirgheal MacCarrick with sweater, red hair, Photorealistic, Hyperrealistic, Hyperdetailed, detailed skin, soft lighting, subsurface scattering, realistic, masterpiece, best quality, ultra realistic, 8k, Intricate, High Detail, film photography, soft focus
Negative prompt (simple):
EasyNegative, ((nsfw)), ((asian)), Japanese, Korean, Chinese.
Negative prompt (complete):
((nsfw)), ((asian)), Japanese, Korean, Chinese, ((disfigured)), ((deformed)), ((extra limbs)), (((duplicate))), ((morbid)), ((mutilated)), out of frame, extra fingers, mutated hands, poorly drawn eyes, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), cloned face, body out of frame, out of frame, bad anatomy, gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), (fused fingers), (too many fingers), (((long neck))), tiling, mutated, cross-eye, canvas frame, frame, cartoon, 3d, weird colors, blurry, cgi, 3d, render, sketch, cartoon, drawing, anime, cropped, Easynegative
Steps: 30-50 (use DPM++SDE Karras; EulerA sometimes works, but you will lose detail)
CFG scale: 7-9 (8 Ideal).
You can remove "((asian)), Japanese, Korean, Chinese" if that is what you want to generate. After many attempts at prompting, I found that the "EasyNegative" embedding alone is enough for the negative prompt, especially in v1.5-2.0.
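The parentheses in the prompts above are Automatic1111's emphasis syntax: each pair of parentheses around a term multiplies its attention by 1.1, so a term in triple parentheses is weighted at roughly 1.33. The sketch below only illustrates that arithmetic. The function name `emphasis_weight` is hypothetical, and it handles just the plain `(term)` form, not `(term:1.3)` or `[term]`.

```python
def emphasis_weight(term):
    """Return (bare_term, weight) for a term wrapped in plain parentheses."""
    term = term.strip()
    depth = 0
    # Peel off one matching pair of parentheses per iteration.
    while term.startswith("(") and term.endswith(")"):
        term = term[1:-1].strip()
        depth += 1
    # Each pair multiplies attention by 1.1.
    return term, round(1.1 ** depth, 3)


negative = "((nsfw)), (((long neck))), (fused fingers), blurry"
for part in negative.split(","):
    print(emphasis_weight(part))
```

This is why terms such as `(((long neck)))` push the result harder than unwrapped terms such as `blurry`.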
Example Prompting v2:
Positive prompt:
RAW photo of Isaura Ojeda in casual clothes, little smile, realistic, city streets <lora:General-Design\add_detail:0.7>
Negative prompt (simple):
nudes, asian, worst quality, normal quality, bad quallity, text, artifacts, bad eyes, strabismus, deformed, cartoon, render
Steps: 30-50 (use DPM++SDE Karras; EulerA sometimes works, but you will lose detail)
CFG scale: 6-7. LCM: You can use it with Euler-a Normal, or LCM normal. CFG 1-2 and Steps 5-20.
You can remove "asian" if that is what you want in face generation.
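If you prefer scripting to the web interface, the example above can be written as a request body. This is a minimal sketch that only builds and prints the body; it does not contact any server. It assumes the field names of the optional Automatic1111 API (`/sdapi/v1/txt2img`, enabled with the `--api` launch option), which can differ between versions, and the sampler spelling is also an assumption. The helper name `build_txt2img_payload` is hypothetical, and the Lora tag is left out because its path depends on your own folders.

```python
import json


def build_txt2img_payload(prompt, negative_prompt, steps=30, cfg_scale=6,
                          batch_size=1, batch_count=4):
    """Build a request body for the example; field names are assumed, see above."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "sampler_name": "DPM++ SDE Karras",  # assumed spelling of the sampler
        "width": 512,
        "height": 768,
        "batch_size": batch_size,  # images generated at the same time
        "n_iter": batch_count,     # batches generated one after another
    }


payload = build_txt2img_payload(
    prompt="RAW photo of Isaura Ojeda in casual clothes, little smile, "
           "realistic, city streets",
    negative_prompt="nudes, asian, worst quality, normal quality, bad quallity, "
                    "text, artifacts, bad eyes, strabismus, deformed, cartoon, render",
)
total_images = payload["batch_size"] * payload["n_iter"]
print(json.dumps(payload, indent=2))
print("images per run:", total_images)
```

The total number of images is batch size times batch count, so the defaults here produce 4 images one after another, which suits cards with less VRAM.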
What comes in the future
I’m already working on improving the model. It was trained at a 512 image size, so I will try 768 (a bigger one), as well as other configurations (such as changing captions, steps, epochs, etc.). If you would like a better version of this model, please keep supporting me on Ko-fi. The more people support me, the more time I can invest in training and improving models; without that support I cannot.
I launched my first private model for my Ko-fi membership lv.1, called "eddiemauro scene", for minimalistic scenery creation for rendering. If you want access to private models, you can support me by subscribing to this membership. I will also start uploading more models here, centered on product and car design.
License
See the Stable Diffusion license here. This specific model is meant for experimentation. It is prohibited to:
Upload this model to any server or public online site without my permission.
Share this model online without my permission, whether by using my exact model under a different name or by uploading it and running it on services that generate images for money.
Merge it with a checkpoint or a Lora and then publish or share the result online; just talk to me first. In the future,
Sell this model or merges that use this model.
Supporting
You can follow me on my social networks, where I show my process along with design tips and tools. You can also check my webpage; if you need a design service, I work as a freelancer.
https://www.facebook.com/eddiemauro.design
https://www.instagram.com/eddiemauro.design