READ "ABOUT THIS VERSION" for GEN info -->
1152x1752 in 4 seconds on a 3090!
LCM fused with SDXL Turbo has arrived.
LCM UPDATE: 1-2 seconds per generation! READ "ABOUT THIS VERSION" -->
UPDATE VAE with FP16 fix for better details: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix
V.5 is up! Better everything - enjoy!
V.4 is up! Better photorealism...AGAIN!
V.3 wants DPM++ 3M SDE, and V.3 also has a new, better license!
Image compatibility between ComfyUI and A1111 - same image everywhere! This breaks seeds, and you will not be able to get the same image as me without these changes! Read more here: https://github.com/Mikubill/sd-webui-controlnet/discussions/2039
↓ Settings and recommendations down below ↓
Easy and complex at the same time, this model is very versatile in the right hands. Better photorealism in XL is here.
⋅ ⊣ Why?
Look no further. The era of sharp, versatile models for XL is here, in big part because of this awesome community. This model builds upon the knowledge provided by the SDXL 1.0 model and the incredible base it has given us - thanks to the team over at StabilityAI!
But as many have noted, there is always room for improvement. This model aims to take XL generations to a new plateau on which to build further and generate some really cool images along the way - be it photographs or digital art.
Realities Edge (RE) stabilizes some of the weakest spots of SDXL 1.0 base, namely details and lack of texture. Sometimes XL base produced patches of blurriness mixed with in-focus parts and, on top of that, thin people and slightly skewed anatomy. The diversity and range of faces and ethnicities also left a lot to be desired, though it is a great leap forward since the days of 1.5. Lastly, the art in all its different styles and forms: SDXL base is far more capable than its predecessors and a huge upgrade for us to play with, but there are some art styles the model still struggles with. The additions made to RE in this regard are big.
SDXL was released to all of us here. Now we build.
⋅ ⊣ What?
A methodical chaoswarp* of the best available models on Civitai, combined with custom, unreleased XL Loras I've been training these past weeks, has resulted in this model. It is capable of photorealism and natural photography, but that just scratches the surface. RE can do NSFW and has great anatomy information paired with Loras for better skin texture and more realistic faces, eyes and mouths. A whole slew of anatomical issues have been mostly fixed for the ladies, and hands have also improved a lot, giving way to staggering realism. The men still have room for improvement, but with this as a base I think that improvement will be here quickly.
Realities Edge is first and foremost an art machine. Bombastic oil paintings, atmospheric art photography, futuristic 3D, all forms of digital art and anything in between. If it's been expressed in art in some period of human history, RE should be able to handle it or at least give you a great base to train your own stuff with! Loras are more accessible than ever, with SDXL being the easiest platform to train on (but a hard one to master 😉).
RE has a wide array of art styles to choose from, and most of them come out sharp and vibrant, ready for further tweaking and upscaling if needed. Illustrations, vector, oil paintings, watercolor, vintage cameras like Kodak and Ektachrome; product photography, concept art, macro, portraits, animals, comics, characters, Western-style, Eastern-style, medieval, RPGs like D&D, mechanical parts, aliens. All of these can be combined, twisted around, merged and re-synthesized into whatever concoction you can imagine.
⋅ ⊣ How?
This model leans heavily on the fantastic community model maker socalguitarist's XL models, infused with high volumes of my own acidic Loras that burn off bad quality, low resolution, wonky eyes and airbrushed skin texture, and add a needed boost to the creativity and range of style that the base model from StabilityAI lacks. Together with the training of the community merged within, this model shines.
There have been around 17 iterations before arriving at this one. Models have been merged with regular checkpoint merging in both Weighted sum and Add difference modes, but the heavy lifting was done in MBW (block merging). The many Loras were trained with kohya_ss at a network dim (rank) of 256 for the sharpest detail and highest quality possible, at the cost of disk space.
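For anyone curious what those merge modes do under the hood, here is a minimal sketch, not the actual recipe used for RE. It assumes the usual A1111-style formulas (Weighted sum: A*(1-M) + B*M, Add difference: A + (B-C)*M) and treats block merging as a weighted sum with a separate multiplier per block. Plain lists of floats stand in for tensors, and the function names and key prefixes are hypothetical, picked just for illustration.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def weighted_sum(a: StateDict, b: StateDict, alpha: float) -> StateDict:
    """Blend two models: A * (1 - alpha) + B * alpha, key by key."""
    return {
        key: [(1 - alpha) * x + alpha * y for x, y in zip(a[key], b[key])]
        for key in a
    }


def add_difference(a: StateDict, b: StateDict, c: StateDict, alpha: float) -> StateDict:
    """Add what B learned on top of C to A: A + (B - C) * alpha."""
    return {
        key: [x + alpha * (y - z) for x, y, z in zip(a[key], b[key], c[key])]
        for key in a
    }


def block_weighted_sum(
    a: StateDict,
    b: StateDict,
    block_alphas: Dict[str, float],
    default_alpha: float,
) -> StateDict:
    """Weighted sum with a different alpha per block (assumed MBW-style).

    block_alphas maps a key prefix (hypothetical, e.g. "input_blocks.")
    to the alpha used for every parameter whose name starts with it.
    """
    merged: StateDict = {}
    for key in a:
        alpha = default_alpha
        for prefix, block_alpha in block_alphas.items():
            if key.startswith(prefix):
                alpha = block_alpha
                break
        merged[key] = [(1 - alpha) * x + alpha * y for x, y in zip(a[key], b[key])]
    return merged
```

In a real merge the values would be tensors loaded from checkpoint files, and the per-block multipliers are where most of the tuning happens.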
Speaking of which: the total footprint for the model is ~170 GB.
The model is only capable of producing some basic anime, but don't despair: during the process an anime Lora was born from the potion mixing, scheduled for release at the end of August. But that's for another post.
NO REFINER NEEDED
⋅ ⊣ Capabilities and recommendations:
Photorealism, 3D, 2.5D, Illustrations, Photomanipulation, Portraits and much more
Works very well with Loras - both as a base to train on and for rendering
Excels at both types of CLIP prompting. Be it maximalist OpenAI style prompts or minimalist story driven LAION prompts (written in a more natural language without constant commas).
Great lighting; shines with easy, short prompts and aggressive (but short) negative prompts.
Very low risk of burned generations even on higher CFG - recommend 5.5-15
Responds amazingly to hires.fix with just a scaling of 1.0-1.5 and beyond. I like doing it with no scaling at all, just letting it run through with a sharp upscaler and fewer steps. If you have the VRAM for it, push the scaling higher, go nuts!
Favorite resolutions are 768x1344 and 1024x1296. Works well for landscapes at even bigger resolutions. Also works for anamorphic-style widescreen at resolutions of 1920x816 or the like. Test what works best for you.
DPM++ 3M SDE Karras recommended, but always test your favorite!
All the img2img modes work really well and balancing a low CFG with a higher-than-average Denoising Strength will produce a sharp and clear upscale full of interesting details, using regular SD upscale. I wonder what you can do with Ultimate SD upscale?
Likes Clip Skip 1-4. I frequently use 2.
Knows about some celebrities - good LoRA base!
Use with ToMe (token merging) in A1111 (I'm sure it's implemented in Comfy as well) for a much faster SDXL generation time - changes seeds though!
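The recommendations above can be sketched as a small settings check. This is only an illustration: the class and field names are hypothetical and do not belong to A1111 or ComfyUI, the default CFG of 7.0 is my own pick from inside the recommended range, and the sampler string is assumed to match the A1111 label. The ranges themselves (CFG 5.5-15, Clip Skip 1-4, 768x1344) come from the list above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Ranges taken from the recommendations above.
CFG_RANGE = (5.5, 15.0)
CLIP_SKIP_RANGE = (1, 4)


@dataclass
class GenerationSettings:
    """Hypothetical container for the suggested starting settings."""

    width: int = 768
    height: int = 1344
    cfg_scale: float = 7.0  # assumption: any value in CFG_RANGE is fine
    clip_skip: int = 2
    sampler: str = "DPM++ 3M SDE Karras"
    hires_scale: float = 1.0  # 1.0 means a hires pass with no upscaling

    def warnings(self) -> List[str]:
        """Return a note for every value outside the recommended ranges."""
        notes: List[str] = []
        if not CFG_RANGE[0] <= self.cfg_scale <= CFG_RANGE[1]:
            notes.append(f"CFG {self.cfg_scale} is outside {CFG_RANGE}")
        if not CLIP_SKIP_RANGE[0] <= self.clip_skip <= CLIP_SKIP_RANGE[1]:
            notes.append(f"Clip Skip {self.clip_skip} is outside {CLIP_SKIP_RANGE}")
        return notes

    def hires_size(self) -> Tuple[int, int]:
        """Output resolution after the hires pass at the chosen scale."""
        return (
            round(self.width * self.hires_scale),
            round(self.height * self.hires_scale),
        )
```

For example, `GenerationSettings(hires_scale=1.5).hires_size()` gives the resolution a 1.5x hires pass would render at, which is a quick way to judge whether your VRAM can take it.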
* = The word "chaoswarp" is defined by large amounts of coffee and lots of nights spent waiting by the computer, dreaming up ever more complex prompts and folding styles, tales and characters into elaborate images. In the haze of the late hours, ideas and experiments unfold and move on with such haste that recalling the exact steps taken is, at this time, impossible.
"Like ReV and RV but for XL - amazeballs!"
- some dude on the internet
⋅ ⊣ thanks, and see you again ⊢ ⋅