ZootVision - Eta

Verified: SafeTensor
Type: Checkpoint Trained
Published: Aug 15, 2024
Base Model: SD 1.5
Training: 117,992 steps, 687 epochs
Hash (AutoV2): B9B42E3949

What is this?

I would describe it like so: an abnormally versatile SD 1.5 model with extensive custom training done exclusively at 1024px and higher (thanks to "bucketing"). It is built up in a clean, additive, iterative fashion on an ongoing basis using CivitAI's handy online LoRA trainer. It can do everything from pretty landscapes to hardcore booru-tag based NSFW in pretty much any style. It is not specifically an anime, realistic, or semirealistic checkpoint, but rather whichever of those you want it to be at any given time. All showcase images are direct generations made without any detailing or upscaling whatsoever (i.e. you should basically treat this like an XL model when using it), and they include full metadata.

How do I use it?

You can use either natural language or booru tags (with spaces, not underscores). I tend to use both at once: mostly coherent sentences in which many of the words and phrases are specific tags that actually exist. See the showcase gallery for a variety of examples. In terms of resolution, in my opinion it is completely pointless to ever go lower than 768x768 with this model (100% of my training is done at 1024px without downscaling or cropping anything).
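As a rough illustration of the "spaces, not underscores" point, here is a minimal sketch that turns raw booru-style tags into prompt text. The function name `booru_to_prompt` and the sample tags are hypothetical, chosen only for this example; note that a few real tags (emoticon-style ones, for instance) contain underscores on purpose and would need special handling that this sketch does not attempt.

```python
def booru_to_prompt(tags):
    """Join booru-style tags into comma-separated prompt text.

    Underscores are replaced with spaces, and empty entries are dropped.
    """
    cleaned = []
    for tag in tags:
        tag = tag.strip().replace("_", " ")
        if tag:
            cleaned.append(tag)
    return ", ".join(cleaned)


# Mixing a natural-language sentence with converted tags (sample tags only).
tag_part = booru_to_prompt(["long_hair", "looking_at_viewer"])
prompt = "a photo of a woman standing in a forest, " + tag_part
print(prompt)
```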

Personally, I do not ever generate lower than 1024x768 or 768x1024 with this, and more often actually do 1216x832 and 832x1216 when it comes to non-square-format images. For square format I personally stick to 1024x1024. Again, you can download my showcase images at their original resolution with full metadata to get a better idea of what this thing can do, as it is also trained on some less common "exotic" aspect ratios / resolutions too.
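The resolution advice above can be captured in a small sanity check. This is only a sketch: the constant and function names are hypothetical, the 768px floor is one reading of "never lower than 768x768" (applied to the shorter side), and the multiple-of-8 rule comes from SD 1.5's 8x latent downscaling rather than from this page.

```python
# Resolutions named in the text, as (width, height).
SQUARE = (1024, 1024)
LANDSCAPE = [(1024, 768), (1216, 832)]
PORTRAIT = [(768, 1024), (832, 1216)]

MIN_SIDE = 768  # assumed reading: the shorter side should not drop below this


def check_resolution(width, height):
    """Return a list of warnings for a requested generation size."""
    problems = []
    if width % 8 != 0 or height % 8 != 0:
        problems.append("width and height should be multiples of 8")
    if min(width, height) < MIN_SIDE:
        problems.append("shorter side is below 768px")
    return problems


for size in [SQUARE] + LANDSCAPE + PORTRAIT + [(512, 512)]:
    print(size, check_resolution(*size) or "ok")
```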

Also note that if you're prompting for 2D-style images, this model DOES recognize a large selection of "by whoever" artist tags (some stronger than others), so if there's one you have in mind just try it.

Tip: generally speaking, SDE samplers provide better results with this model if you're going for realism. I personally am a big fan of DPM++ 3M SDE GPU Exponential, at around 4.0 - 4.5 CFG. For anything less realistic, however, you may also want to simply try Euler Ancestral (or very occasionally DPM++ 2M Karras) at around CFG 7.0.
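For quick reference, the tip above can be written down as a small preset table. This is a sketch only: the dictionary keys, the `settings_for` helper, and the choice of the CFG range midpoint as a default are all assumptions for illustration. The sampler names are copied from the text as written and may be spelled differently in your UI.

```python
# Presets transcribed from the tip above; the key names are hypothetical.
SAMPLER_PRESETS = {
    "realism": {
        "sampler": "DPM++ 3M SDE GPU Exponential",
        "cfg_min": 4.0,
        "cfg_max": 4.5,
    },
    "stylized": {
        "sampler": "Euler Ancestral",
        "cfg_min": 7.0,
        "cfg_max": 7.0,
    },
    "stylized_alt": {
        "sampler": "DPM++ 2M Karras",
        "cfg_min": 7.0,
        "cfg_max": 7.0,
    },
}


def settings_for(style, cfg=None):
    """Return sampler and CFG for a style, defaulting CFG to the range midpoint."""
    preset = SAMPLER_PRESETS[style]
    if cfg is None:
        cfg = (preset["cfg_min"] + preset["cfg_max"]) / 2
    return {"sampler": preset["sampler"], "cfg": cfg}


print(settings_for("realism"))
print(settings_for("stylized"))
```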

Do "masterpiece", "best quality", "high quality", "worst quality", and so on exist in this model?

Yes, but their impact on the image is much smaller if your overall prompt is for realism or semirealism. They have the most noticeable impact on 2D-style images. However, "detailed background" and "simple background" specifically DO both have the impact you'd expect on all types of images, generally speaking.

V7.0 Eta Details:

Better realism, and prompt adherence should be, I think, the best it has ever been. Really happy with this version. VAE is baked in as always.

V6.5 Zeta Plus Details:

It's not quite what ZootVision V7 Eta is intended to be yet, but it makes some nice, perhaps subtle, improvements. This time I tried to emphasize the actual depth of the model a bit more in the showcase gallery images. VAE is baked in as always.

V6.0 Zeta Details:

Improved basically everything TBH. Did all the stuff I talked about in the comments, and a bunch more. Made some pretty weird showcase gens just to kinda show off a bit more of what this thing can actually do, lol. VAE is baked in as always. Also don't forget that this model does in fact know a very large number of "by whoever" booru-format artist tags, not only the specific ones you've seen me mention before!

V5.0 Epsilon Details:

Trained for an additional 10,000 steps on a variety of subjects (photorealism, NSFW, and anime have all been at least somewhat refined) on top of V4.0 Delta. This version also introduces an Ideogram style dataset, which can be triggered by using "by ideogram" in any prompt. See the showcase gallery for some examples. I think this is a pretty solid improvement over Delta, hope you enjoy it! VAE is baked in as always.

V4.0 Delta Details:

Two additional datasets merged in (one for further enhancement of photographic images of people and places, one for some experimental "tricky prompt" rich captioning stuff), both trained on V3.0 Gamma for a combined total of 9,040 steps. VAE is baked in as always. All data in the new photographic dataset was tagged with "photo \(medium\)" in order to build on top of the model's existing understanding of that tag. This is definitely the best version yet, hope you enjoy it!
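The backslashes in that tag suggest a prompt syntax where bare parentheses are treated as emphasis markers (as in AUTOMATIC1111-style UIs), so literal parentheses in a tag need escaping. Assuming that is the case for your UI, a minimal sketch of the escaping step could look like this; `escape_parens` is a hypothetical helper name, and it leaves already-escaped parentheses alone.

```python
import re


def escape_parens(tag):
    """Backslash-escape any parenthesis that is not already escaped."""
    # (?<!\\) skips parentheses that already have a backslash in front.
    return re.sub(r"(?<!\\)([()])", r"\\\1", tag)


print(escape_parens("photo (medium)"))
print(escape_parens(r"photo \(medium\)"))
```

Both calls should produce the same escaped tag, so the helper is safe to run on prompts that are already partly escaped.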

V3.0 Gamma Details:

1000-image "aesthetic" dataset (trained for 10,000 steps on V2.0 Beta) merged in. The effect of this dataset can optionally be strengthened by using the phrase "very aesthetic" anywhere in your prompt. VAE is baked in as always.

V2.0 Beta Details:

Merged with 1000-image "NSFW Enhancer" dataset (trained for 10,000 steps on V1.0 Alpha). All images were at least 1024px on at least one side, up to a maximum of 1216 (for XL-style 832x1216 portrait / 1216x832 landscape images, of which there were a fair number).
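For readers unfamiliar with bucketing, the general idea is that each training image is assigned to the predefined resolution whose aspect ratio is closest to its own, rather than being cropped to a square. The sketch below shows that idea only; the bucket list is an assumption built from resolutions mentioned on this page, and the actual buckets and matching rule used by the trainer are not stated here.

```python
# Assumed bucket list (width, height); not the trainer's real configuration.
BUCKETS = [(1024, 1024), (1216, 832), (832, 1216), (1024, 768), (768, 1024)]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


# A 3:2 landscape photo, a square image, and a 3:4 portrait.
for size in [(3000, 2000), (2000, 2000), (1500, 2000)]:
    print(size, "->", nearest_bucket(*size))
```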

V1.0 Alpha Details:

My (incomplete) attempt at a truly general-purpose high-resolution-focused SD 1.5 model, in the sense of anything from pretty landscapes to hardcore booru-tag based NSFW porn.

Uploading to CivitAI in the current state basically for the sole purpose of using their LoRA trainer for a few more 1000-image datasets I need to get trained and merged into this thing. Feel free to try it out anyway if you like (it knows many characters, see e.g. Jinx in the showcase), but expect relatively different results from later versions and the final version.

General (always relevant) details:

DO NOT blindly assume that Clip Skip 2 is always "correct" with this model, as it is not really traditionally NAI-derived at all. If you've found a particular seed that you mostly like but that isn't quite "there" for a given prompt, I'd recommend just trying both Clip Skip 1 and 2, as in my testing both give good results under different circumstances.
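One way to act on this is to queue the same prompt and seed twice, changing only the Clip Skip value, and compare the results side by side. The sketch below just builds the two settings dictionaries; the key names (`prompt`, `seed`, `clip_skip`) and the `clip_skip_variants` helper are hypothetical and would need to be mapped onto whatever your UI or API actually expects.

```python
def clip_skip_variants(prompt, seed, base_settings=None):
    """Build one settings dict per Clip Skip value, keeping everything else fixed."""
    base = dict(base_settings or {})
    return [
        {**base, "prompt": prompt, "seed": seed, "clip_skip": skip}
        for skip in (1, 2)
    ]


variants = clip_skip_variants(
    "a castle on a hill, detailed background",
    seed=12345,
    base_settings={"width": 1216, "height": 832},
)
for v in variants:
    print(v)
```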