
RobMix

Verified: SafeTensor
Type: Checkpoint Merge
Stats: 1,722
Published: Jun 24, 2024
Base Model: SDXL 1.0
Usage Tips: Clip Skip: 3
Hash (AutoV2): 9521FC034D

A realistic model, built for quality and creative range.

ze·nith

noun. the time at which something is most powerful or successful.

RobMix Zenith is the next iteration in my series of artisan photorealistic model merges with unnecessarily superlative version names.

This merge is like a classic martini: simple, with just a couple of ingredients, but mixed with precision and handled with care. It blends RobMix Evolution with Corcel's fantastic Mobius base model, using block-by-block tuning to combine the best of the RobMix style with the best of Mobius' quality and creativity.

Note: Mobius requires a clip skip of -3. This merge doesn't, but you can get some interesting results by experimenting with clip skip values between -1 and -3.

Like my other merges, it's geared toward a photographic style with an emphasis on balancing realism with creativity, but it also has some gems in illustrated or artistic styles if you prompt for them.

This model works great out of the box, but it shines with some workflow optimizations. I've made some suggestions later in this post, and you can try them out with my workflows here.

In the sample images, the second pass is a 1.5x latent upscale at 0.3 to 0.4 denoise and 40 steps. Everything was generated in Comfy.
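For anyone planning a second pass at a different base size, the arithmetic can be sketched as below. This is only an illustration: the function name is hypothetical, and snapping to a multiple of 8 is an assumption based on the SDXL VAE's 8x downscale between pixels and latents, not something the workflow itself specifies.

```python
def second_pass_size(width: int, height: int, scale: float = 1.5,
                     multiple: int = 8) -> tuple[int, int]:
    """Return the pixel size after a latent upscale by `scale`.

    Each side is snapped to the nearest multiple of `multiple`
    (assumed to be 8, the SDXL VAE downscale factor).
    """
    def snap(value: float) -> int:
        return int(round(value / multiple)) * multiple

    return snap(width * scale), snap(height * scale)


# Second-pass settings quoted in the post (denoise is a range, not one value).
SECOND_PASS = {"upscale": 1.5, "denoise": (0.3, 0.4), "steps": 40}

print(second_pass_size(1024, 1024))  # (1536, 1536)
print(second_pass_size(832, 1216))   # (1248, 1824)
```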

  • Sampler: DPM++ 3M SDE

  • Scheduler: AlignYourSteps

  • CFG: 3-4 (or use Automatic CFG)

  • Steps: 30-40

  • Clip Skip: -2 or -3

  • Aspect Ratio: 1:1, 2:3, 3:4, 16:9, 21:9, vertical or horizontal
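The aspect ratios above still need concrete pixel sizes. A minimal sketch, assuming a budget of roughly one megapixel (SDXL's native 1024x1024) and sides snapped to multiples of 64; both are common conventions rather than settings taken from this post, and the function name is hypothetical.

```python
import math


def sdxl_resolution(ratio_w: int, ratio_h: int,
                    budget: int = 1024 * 1024,
                    multiple: int = 64) -> tuple[int, int]:
    """Pick a width and height near `budget` pixels for a given aspect ratio."""
    aspect = ratio_w / ratio_h
    height = math.sqrt(budget / aspect)   # ideal, unsnapped height
    width = height * aspect               # ideal, unsnapped width

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


print(sdxl_resolution(1, 1))    # (1024, 1024)
print(sdxl_resolution(16, 9))   # (1344, 768)
print(sdxl_resolution(9, 16))   # (768, 1344), the vertical version
```

Swapping the two ratio arguments gives the vertical or horizontal variant of the same shape.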

Advanced Settings

  • FreeU v2

    • b1: 1.05

    • b2: 1.08

    • s1: 0.95

    • s2: 0.8

  • Perturbed Attention Guidance

    • Scale: 0.5-1

    • Adaptive Scale: 0.1

  • Automatic CFG
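If you script your workflows, the advanced settings above can be kept in one place. This is a sketch only: the class and field names are hypothetical and do not correspond to any ComfyUI node API, and the default PAG scale of 0.75 is simply an assumed midpoint of the suggested 0.5-1 range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeUV2Settings:
    # Values quoted in the list above.
    b1: float = 1.05
    b2: float = 1.08
    s1: float = 0.95
    s2: float = 0.8


@dataclass(frozen=True)
class PAGSettings:
    scale: float = 0.75          # assumed midpoint of the suggested 0.5-1 range
    adaptive_scale: float = 0.1

    def in_suggested_range(self) -> bool:
        """True if `scale` falls inside the suggested 0.5-1 range."""
        return 0.5 <= self.scale <= 1.0


freeu = FreeUV2Settings()
pag = PAGSettings(scale=0.5)
print(freeu, pag, pag.in_suggested_range())
```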

How to prompt this model

This model works best with natural-language prompting. I've gotten the very best results by separating CLIP-G and CLIP-L, using natural language in CLIP-G and SD 1.5-style keyword-based prompting in CLIP-L.
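One way to split the two encoders in Comfy is the built-in CLIPTextEncodeSDXL node, which takes separate `text_g` and `text_l` inputs. The sketch below builds that node as an API-format dictionary. The helper name and the node id `"4"` for the checkpoint loader are hypothetical, and the input field names are written as I understand them, so check them against your own install.

```python
import json


def sdxl_dual_prompt_node(text_g: str, text_l: str, clip_link: list,
                          width: int = 1024, height: int = 1024) -> dict:
    """Build a CLIPTextEncodeSDXL node with separate CLIP-G and CLIP-L text."""
    return {
        "class_type": "CLIPTextEncodeSDXL",
        "inputs": {
            "clip": clip_link,        # [source node id, output index]
            "text_g": text_g,         # natural-language prompt for CLIP-G
            "text_l": text_l,         # keyword-style prompt for CLIP-L
            "width": width,
            "height": height,
            "crop_w": 0,
            "crop_h": 0,
            "target_width": width,
            "target_height": height,
        },
    }


node = sdxl_dual_prompt_node(
    text_g="A high-resolution, atmospheric photograph of a serene sunset "
           "over a mountainous landscape.",
    text_l="high-resolution photograph, sunset, mountains, lone tree, warm tones",
    clip_link=["4", 1],  # hypothetical: CLIP output of a checkpoint loader with id "4"
)
print(json.dumps(node, indent=2))
```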

I've created a custom GPT to help with this. By default, it will generate CLIP-G style prompts, but you can optionally ask it for CLIP-L and/or T5 style prompts. The GPT follows my Prompt Pyramid style of prompting, which may not be the best, but it's how I do things.

Example CLIP-G*

A high-resolution, atmospheric photograph capturing a serene sunset over a mountainous landscape. The composition features a lone tree standing on a hillside, silhouetted against the warm, golden light of the setting sun. The sky is a gradient of soft oranges and yellows, blending into the horizon. Rays of sunlight stretch across the scene, creating long shadows and adding depth to the rolling hills. The overall mood is peaceful and contemplative, with a harmonious balance between light and shadow. The exposure is perfectly balanced, emphasizing the natural beauty and tranquility of the landscape.

* If your prompt exceeds 75 tokens, be sure to properly handle concatenation.
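The 75-token limit comes from CLIP's 77-token context window, which leaves 75 slots after the start and end tokens. One common way to handle longer prompts is to split the tokens into chunks of 75, encode each chunk separately, and concatenate the resulting embeddings along the sequence axis. The sketch below shows only the chunking step. The function name is hypothetical and the integer list is a stand-in for real tokenizer output, so treat it as an illustration, not as how any particular UI implements it.

```python
def chunk_tokens(token_ids: list[int], chunk_size: int = 75) -> list[list[int]]:
    """Split token ids into consecutive chunks of at most `chunk_size`.

    An empty prompt still yields one empty chunk, so there is always
    something to encode.
    """
    chunks = [token_ids[i:i + chunk_size]
              for i in range(0, len(token_ids), chunk_size)]
    return chunks or [[]]


# Stand-in for a tokenized 160-token prompt (not real CLIP token ids).
fake_tokens = list(range(160))
print([len(chunk) for chunk in chunk_tokens(fake_tokens)])  # [75, 75, 10]
```

Each chunk would then be wrapped with the start and end tokens, padded to 77, and encoded on its own before the embeddings are joined.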

Example CLIP-L

High-resolution photograph, young woman, leaning out of vintage red car window, arms crossed on door, head tilted, calm contemplative expression, curious gaze, engaging connection, framed upper body, smooth vintage vehicle lines, nostalgic feel, softly blurred background, serene reflective mood, muted warm tones, timeless quality.