Krea 2: simple gen workflow for high quality realism + lots of info & tips

Rating: 5 (134 reviews)
Author: nsfwVariant

Updated: Jul 11, 2026

krea_simple_v1.zip

14.14 KB

Details

Type: Workflows
Stats: 5,553
Reviews: Very Positive (134)
Published: Jul 2, 2026
Base Model: Krea 2
Hash (AutoV2): EDF3C10C43

nsfwVariant

Joined Apr 14, 2024

What is this?

This is a simple workflow for generating high quality, realistic images at high resolution using Krea 2. There's also an optional full-turbo version of the workflow, which is not suitable for realism (or creativity) but is handy for some things. Below in this post there are also some tips & a lot of info about the model.

The sampler & lora settings in this workflow also improve the facial expressiveness of people generated with Krea 2. There's an explanation of how/why in the info section below. It's not perfect, but it's the best we can do until finetunes come out.

Otherwise, the sampler settings are geared towards sharpness and clarity - but you can introduce grain and other defects through prompting or with loras. It also does anime / digital artwork / whatever images well, but you may want to bypass the second sampler for that.

All the images attached to the post were generated directly with this workflow with no further editing.

Nodes & Models

Custom Nodes:

RES4LYF - A very popular set of samplers & schedulers, and some very helpful nodes. These are needed to get the best outputs, IMO.

RGTHREE - (Recommended) A popular set of helper nodes. If you don't want this you can just delete the seed generator and lora power loader nodes, then use the default comfy nodes instead. RES4LYF comes with seed generator & lora nodes as well, I just like RGTHREE's more.

ComfyUI GGUF - (Optional) Lets you load GGUF models, which for some reason ComfyUI still can't do natively. Once installed, you use the "Unet Loader (GGUF)" node to load the model. If you're not using any GGUF models you can just skip this.

Required Models:

Important Note: If you can, you should use the Int8 Convrot version of the model (unless you want the higher quality of BF16). The Int8 Convrot model is almost 2x as fast to gen with, and is the same quality as FP8. Massive free speed boost. You will need to update your ComfyUI, since support was only added in early July 2026. You will also need an NVIDIA GPU and CUDA 13.0 (cu130) or higher.

Main model: Krea2 RAW BF16 / FP8 / Int8 Convrot | or | Krea2 RAW GGUFs - It is strongly recommended that you use the RAW main model with the turbo lora at 0.6 strength instead of the Turbo main model when making photo-real images. It gives WAY better results, and the only downside is that it takes a bit longer to gen. Gen times are already pretty short, so that's not a big deal.

The main workflow assumes you're using the RAW model with the turbo lora, and the settings will be very bad if you use the turbo main model instead.

Even the 'full_turbo' workflow still uses the raw model, seeing as you can just set the turbo lora to 1.0 strength and then it does pretty much the same thing as the turbo main model.

Turbo Lora: Rank 64 Turbo Lora - Using this with the RAW model at ~0.6 strength is better than using the Turbo model. The only real downside is speed. Even then, if you're in a hurry you still have the option of upping the strength to 1.0, which makes it just like the turbo model. Gen times are only 50% longer when it's at 0.6, so it's not really worth it to use full turbo IMO.

Anti-Censorship Lora: 2 Vector Bypass Lora - You should use this even if you're doing SFW stuff. More detail is below, but essentially this will massively improve prompt adherence, facial expressiveness, character detail, and numerous other things. There is no downside as long as your sampler settings are good (which this workflow takes care of for you). Do not use other bypass loras, they go too far or cause degradation of quality; this is the only one that works properly.

Text Encoder: Qwen3 VL 4B - Use the BF16 one if you can. Some people say text encoder quality doesn't matter much and that you should use a smaller one, but it does matter and it affects output quality.

If you're using a GGUF text encoder for some reason, swap out the "Load CLIP" node for a "CLIPLoader (GGUF)" node.

VAE: Wan 2.1 FP32 VAE - This gives you sharper, clearer images than when using the Qwen Image VAE. There is no downside. It works because the Wan & Qwen Image VAEs are almost identical, and the FP32 precision improves the quality.

There is an alternative VAE you can use that's even sharper, but it has drawbacks so I've detailed it in the info section further down.
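For quick reference, here is the recommended stack written down as a plain Python dictionary. It is only a summary sketch: ComfyUI does not read this structure, the key names are made up for illustration, and the values are the ones recommended in this post.

```python
# Summary of the recommended setup from this post.
# Key names are hypothetical; nothing in ComfyUI consumes this dict.
RECOMMENDED_SETUP = {
    "main_model": "Krea 2 RAW (BF16, FP8, Int8 Convrot, or GGUF)",
    "loras": [
        {"name": "Rank 64 Turbo Lora", "strength": 0.6},
        {"name": "2 Vector Bypass Lora", "strength": 1.0},
    ],
    "text_encoder": "Qwen3 VL 4B (BF16 if possible)",
    "vae": "Wan 2.1 FP32 VAE",
}


def describe(setup):
    """Return one human-readable line per component of the setup."""
    lines = [f"model: {setup['main_model']}"]
    for lora in setup["loras"]:
        lines.append(f"lora: {lora['name']} @ {lora['strength']}")
    lines.append(f"text encoder: {setup['text_encoder']}")
    lines.append(f"vae: {setup['vae']}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe(RECOMMENDED_SETUP)))
```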

That's the end of the required info, you can stop here if you just want to use the workflow!

Info & Tips

Alternative Sharpening VAE

The Wan 2.1 Upscale2x VAE gives you even sharper images than the Wan FP32 VAE (it's VERY noticeable), but it sometimes introduces extra artifacts into the image and it also amplifies existing ones. It's up to you whether you think it's worth it or not. I personally think it's good for some images and bad for others, so I just output both and pick whichever turns out best.

Here's an example image using the normal Wan FP32 VAE: https://ibb.co/fGtZwdW8

And now the same image using the upscale2x VAE: https://ibb.co/RTj2DjVw

It's not in the workflow by default. To use it, you need to grab the ComfyUI VAE Utils node set and use the "VAE Decode (VAE Utils)" node instead of the regular VAE decoder. Then you also need to downscale your image by 50%, because this VAE decodes the image at 2x resolution (which is why it's so sharp). This pic shows what the setup should look like: https://ibb.co/XcrXmpr
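The resolution bookkeeping for that setup is simple arithmetic, sketched below. The function name and the example resolution are made up for illustration; the only facts taken from the post are the 2x decode and the 50% downscale.

```python
def upscale2x_sizes(gen_width, gen_height, downscale=0.5):
    """Return (decoded_size, final_size) for a 2x-decoding VAE.

    The VAE decodes at twice the generation resolution, and the
    50% downscale brings the image back to the size you asked for.
    """
    decoded = (gen_width * 2, gen_height * 2)
    final = (round(decoded[0] * downscale), round(decoded[1] * downscale))
    return decoded, final


if __name__ == "__main__":
    decoded, final = upscale2x_sizes(1920, 1088)
    print("decoded:", decoded, "final:", final)
```

So the final image has the same pixel dimensions as a normal decode; the extra sharpness comes from decoding at 2x and then shrinking.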

What About Non-Realistic Images?

I still recommend using the raw model with the turbo lora at 0.6 strength for this. This is because the raw model is much more creative than the turbo model; you'll get better variety this way.

However, the second sampler is now optional because you may not need the extra detailing step anymore - you can just bypass it and it'll work fine. You can also change the scheduler in the first ksampler to sgm_uniform if you want an alternative look, but it's up to you. Just don't forget to change it back to beta if you're doing realism again ;)

Full Turbo Workflow?

You'll lose the creativity of the raw model by using it, but that may not matter to you at all depending on what you're doing. Or maybe you just need the speed.

As mentioned earlier, the full turbo workflow is set up for making non-realistic images, like anime / concept art / digital paintings. It only has one sampler because you don't need an additional detailing step, and you don't really need the benefits of a high-noise schedule either.

Euler/sgm_uniform is my general go-to for non-realistic images, and it holds up pretty well for Krea 2. I haven't tested it extensively though so don't take my word that it's the best sampler/scheduler or anything.

Otherwise, the only difference in the workflow is that the turbo lora is set to 1.0. You can also just use the turbo main model with the workflow and drop the turbo lora entirely, but then you're storing two main models for no real reason.

Krea 2's Facial Expression Problem: Censorship

This is the big one.

Basically, there's a lot of discussion going around about how Krea 2 doesn't do a very good job with facial expressions; characters lack expressiveness, and seem to have "dead eyes" a lot of the time. Smiles don't reach the eyes, that sort of thing. It's nearly impossible to make someone look angry, fierce, or anything more than mildly annoyed.

This is a very common problem with distilled models (i.e. turbo models), but in Krea's case it's mostly because of ridiculous censorship. The developers heavily censored Krea 2 against whatever content they arbitrarily decided was 'harmful', and in doing so they lobotomised their own model. It knows how to make an angry face, it just won't do it because it was collateral damage during the lobotomy.

You literally can't make people smile properly in most images with Krea 2 due to the censorship. That's not an exaggeration, try generating someone with a natural, realistic smile. Dumbest thing I've seen in years.

Luckily you can partially bypass the censorship using a simple lora, which you should use even if you're doing SFW stuff. It just makes better images, period. Some people say it also reduces the detail in the images, which is true - but this is actually just because you need to cook them a little longer. That is to say, if you have good sampler settings it's no problem. But it only works up to a point.

This workflow recommends using the bypass lora at 1.0 strength, but sometimes you need to go higher - even for SFW prompts - to get what you need. This isn't good because it degrades the image quality, but that's censorship for you. We'll need finetunes to properly decensor the model. This goes for SFW stuff too, remember - you will have a really hard time making a person look angry, even with the bypass on.

If you can't tell: I'm really annoyed about this and you should be too. The fact that you can't make someone look angry, happy, sad, etc completely ruins the model for a lot of applications. Literally unusable for so many things. All because they don't want your delicate little child brain to see blood or titties.

Luckily, finetuners and lora makers will probably save the day <3

You can also use pornographic loras at low strength (~0.4) to increase prompt adherence even for SFW prompts. Yes, you heard that right: the censorship in this model is so stupid that you can get better SFW facial expressions and general model performance by using porn loras. No joke, I genuinely have porn loras on for most of my SFW generations.

Here's an example where I'm trying to get a strong, fierce expression from a sprinter using the words "She's frowning and snarling with effort" in the prompt.

Using the filter bypass at 1.0 strength it straight up refuses: https://ibb.co/WWV64GzM

It's better (still not good) with the filter bypass at 6.0 strength, but notice the image quality has suffered: https://ibb.co/mCnHq1mF

And... here it is with the filter bypass at 1.0 strength and PORNOGRAPHIC LORAS enabled at ~0.5 strength: https://ibb.co/2YCV1j9Z

Notice that the quality of the one with porn loras hasn't degraded at all, while also adhering to the fierce expression prompt better. I had to cherry pick 10 gens each just to get the first and second pics (which didn't even do a good job), but the porn lora one I only needed 3 gens - and all three of them were usable.

If this isn't a great example of why censorship is stupid then I don't know what is. This model would be god-tier if it wasn't intentionally broken by the devs. We can only hope that finetuned checkpoints can bring back what it lost.

Another area of improvement: it turns out that the model gives slightly better facial expressiveness in the earlier high-noise stages of generation - which means faces are more expressive when images are undercooked. But undercooking your images isn't good of course, so you need to finish cooking them one way or another. This is where a dual sampler setup comes in handy. More on that below.

Lastly, the raw model with the turbo lora at 0.6 strength is a bit better at facial expressions too. All of these tips combined are very helpful, but you'll still struggle with very intense facial expressions for the foreseeable future. Still, at least we can make people smile now (you can't do that with the censorship).

The 2 Vector Bypass Lora

This lora bypasses the censorship in the model, and is superior in every way - even for SFW images. It does reduce the detail of the image, but you can get it back by using noisier sampler settings, and your images will ultimately look better. I recommend using a strength of 1.0 at all times. If you need more censorship unlocks, use more loras instead of increasing the strength of this one.

It works by amplifying two specific vectors during generation (hence the name). This lora is the minimum you need to bypass the censorship, and therefore it's the best one. All the other ones change more stuff than they need to or are way too strong, do not use them. Don't even use the 3 vector one by the same author, just use the 2 vector one.

But we should still be thankful to those who made the other inferior ones, because they did the hard work of figuring out how to bypass the crappy censorship in the model. All efforts for the open source community are appreciated <3

When you use this lora in combination with a high-noise dual sampler setup (like this workflow), you get great detail, great facial expressions, more prompt adherence, and better output variety. No downsides.
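To see why cranking the strength is risky, here is a minimal sketch of how a lora's strength setting scales its effect, assuming the standard LoRA merge rule (weight + strength * scale * up @ down). This is generic LoRA math with toy numbers, not a description of what the 2 vector lora specifically contains; all names and values are made up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, strength, alpha=None):
    """Return weight + strength * (alpha / rank) * (up @ down).

    If alpha is None the scale factor is taken to be 1.0.
    """
    rank = len(down)  # the down matrix has one row per rank
    scale = (alpha / rank) if alpha is not None else 1.0
    delta = matmul(up, down)
    return [
        [w + strength * scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 weight with a rank-1 lora (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0]]
UP = [[1.0], [2.0]]
DOWN = [[0.5, 0.5]]

if __name__ == "__main__":
    print("strength 1.0:", apply_lora(W, UP, DOWN, 1.0))
    print("strength 6.0:", apply_lora(W, UP, DOWN, 6.0))
```

The change to the weights grows linearly with strength, so at 6.0 the lora pushes the model six times further from its original weights than at 1.0. That fits the quality loss seen at high strength.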

The Dual Sampler Setup

Why are we doing dual samplers? Two reasons! One reason is to help solve the facial expression problem, and the other is just to be able to tightly control the amount of detail in the image.

Our first ksampler is doing 6 steps of res_2s using the beta scheduler. Res_2s runs the equivalent of 2 steps, so this is sort of like doing 12 steps. The beta scheduler is very noisy so it makes more big, low-detail changes for more of the steps. Combined together, this sampler/scheduler/step combo undercooks your image on purpose. It doesn't add enough detail and stays in a smooth unfinished state.

That's really important, because at this point the facial expressiveness is better and the overall creativity of the model is higher too. Doing more steps, or doing the same number of steps with a less noisy scheduler (like simple), will reduce facial expressiveness and be less creative. It's also harder to detail it from that point without overcooking your image.

If you're feeling adventurous you can also try euler + beta + 12 steps for the first sampler, which is really good as well and gives different results. I'm recommending res_2s + beta + 6 steps because I personally like it more, but you may like euler + beta + 12 steps more yourself.
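A rough sketch of why the two first-stage options cost about the same, assuming res_2s calls the model twice per step (as described above) and euler calls it once. The lookup table and function name are made up for illustration.

```python
# Model calls per sampler step. The res_2s value follows the
# "equivalent of 2 steps" description above; euler is a plain
# one-call-per-step sampler.
EVALS_PER_STEP = {"euler": 1, "res_2s": 2}


def model_evals(sampler, steps):
    """Total model calls for one sampling stage at CFG 1."""
    return EVALS_PER_STEP[sampler] * steps


if __name__ == "__main__":
    print("res_2s, 6 steps:", model_evals("res_2s", 6))
    print("euler, 12 steps:", model_evals("euler", 12))
```

Both options make the same number of model calls, so gen time should be in the same ballpark; the difference is in how those calls are combined.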

Now the image is well structured, but it lacks detail.

That's where stage 2 comes in! For stage 2, we're using a dense multi-step sampler called deis_3m but with an even noisier scheduler, bong_tangent. However, we're also doing 2 steps and at an extremely low denoise of 0.2. Because deis_3m is a multistep sampler (that's the 3m in the name) and we're doing 2 actual steps, it does a LOT of work - but only changes a small amount at a time due to the 0.2 denoise strength. What this means is we're adding a ton of detail to the image without interfering with the overall structure.

The end result is that stage 2 fills in all the detail & grit of the image without affecting the overall structure. Because we undercooked our first stage, this retains the facial expressiveness and variety while still adding plenty of detail to the image.
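For the curious, here is a sketch of what a low denoise does to the noise schedule. It assumes the behaviour of the stock ComfyUI KSampler as I understand it: build a longer schedule of steps / denoise steps and only run the last few. A toy linear schedule stands in for bong_tangent, and other sampler nodes (including RES4LYF's own) may handle denoise differently.

```python
def linear_sigmas(steps):
    """Toy schedule: noise level falls linearly from 1.0 to 0.0."""
    return [1.0 - i / steps for i in range(steps + 1)]


def sigmas_for_denoise(steps, denoise, schedule=linear_sigmas):
    """Return the noise levels actually sampled for a given denoise.

    With denoise < 1, a longer schedule is built and only its tail
    is used, so sampling starts from a low noise level.
    """
    if denoise >= 1.0:
        return schedule(steps)
    total_steps = int(steps / denoise)
    return schedule(total_steps)[-(steps + 1):]


if __name__ == "__main__":
    # Stage 2 of this workflow: 2 steps at 0.2 denoise.
    print(sigmas_for_denoise(2, 0.2))
```

With 2 steps at 0.2 denoise, the sampler only walks the last fifth of the schedule, where the noise level is already low. That is why stage 2 can add detail without moving the overall structure.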

If you need even more detail, you can use the deis_4m sampler instead. deis_3m is enough most of the time, but you may find that in some cases deis_4m gives a more realistic amount of depth to the details. Just beware that using deis_4m for everything will often give you slightly overcooked images.

Let me know if you've discovered a better sampler setup! This is just the best I could find after around 60 hours of A/B testing; I'm sure there are good alternatives out there waiting to be found.

Krea 2 and the Qwen VAE Halftone Grid

Sounds like the title of a Harry Potter book. Krea 2 has the same problem that all models which use the Qwen Image VAE have: there is a noticeable halftone grid pattern, and that grid pattern heavily interferes with images generated by the model.

Every single model that uses the Qwen VAE has this problem. Qwen Image does it, Qwen Edit does it, Wan does it, Anima does it, and now Krea 2 does it.

The only reason you haven't noticed it with Wan is because you don't normally zoom in on videos. But you will notice it if you ever try generating a video with a beach or a grainy carpet.

The grid isn't that big of a deal if you're working in high res. It's really annoying at low res. Still, it's not a dealbreaker for most stuff. But the grid has another much worse effect: it interferes with small-grain patterns in images.

- By 'small grain patterns' I mean things like sand at a beach, or a grainy carpet, or clothes that have visible weaving, basically anything that's very small/thin and repetitive.
- It happens whenever the grain size of a pattern happens to be similar to the grain size of the halftone pattern in the Qwen VAE, which means your image resolution and the distance to relevant objects matters.
- This is why beach sand in the foreground of a pic looks garbage, but it starts looking more normal further away from the camera.
- This is also why the hair of your character may sometimes look totally fine, while other times it looks like badly scribbled trash; it's all to do with how far it is from the camera + the resolution you're using.

The models themselves have this pattern baked-in due to being trained with the Qwen VAE, so it can't realistically be fixed. You can reduce its effect by post-processing your images (such as by downscaling then upscaling them), and you can also mitigate the effect by adjusting your output resolution so that patterns in your image don't match the Qwen grid size anymore.
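As a toy illustration of why downscaling then upscaling helps, the sketch below averages a pixel-sized checkerboard away. It is pure Python on a tiny grayscale grid; the real halftone grid is not a perfect 1-pixel checkerboard, and in practice you'd use a proper resize node or image editor with a better filter, so treat this only as the idea in miniature.

```python
def box_downscale_2x(img):
    """Average each 2x2 block into one pixel (img is a list of rows)."""
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
            for x in range(0, len(img[0]) - 1, 2)
        ]
        for y in range(0, len(img) - 1, 2)
    ]


def nearest_upscale_2x(img):
    """Repeat every pixel twice horizontally and vertically."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# A 4x4 checkerboard stands in for a fine, repetitive grid pattern.
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
smoothed = nearest_upscale_2x(box_downscale_2x(checker))

if __name__ == "__main__":
    for row in smoothed:
        print(row)
```

The pattern at the scale of single pixels is gone after the round trip, at the cost of some real fine detail, which is the trade-off with this kind of post-processing.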

You can also inpaint the bad parts of your image at low denoise with another model (like Z-image base/turbo) to fix it.

Krea 2 vs Z-Image Base

These are the important differences between the two models. Krea 2 has some big advantages, and it's pretty clear at this point that Krea 2 will overtake Z-image for most purposes. But there are a few things Z-image does better so far.

1. Z-Image Base generally does more realistic human skin (but not always) and is way better at facial expressiveness, even when using the censorship bypass for Krea 2
   - Some of the sample images I've shown are duplicates of the images I did in my Z-Image Base workflow post, you can look at them for comparison: https://www.reddit.com/r/StableDiffusion/comments/1qzncrz/zimage_base_simple_workflow_for_high_quality/
2. Z-Image Base is easier to get photorealistic images from, especially when using prompts that suggest unrealistic things
   - This is partly because you can use CFG easily with Z-Image Base, but in general it seems Krea 2 has a stronger bias for 3D renders, digital artwork and other realism-adjacent styles
   - For example, if you ask for a 'futuristic city' you'll probably get concept art of a city with Krea 2, rather than something that looks like a photograph - and it can be really really really hard to stop it from doing that
   - If you ask for a character with inhuman features, like an elf, you're very likely to get a person that looks like a 3D render with Krea 2
   - Even normal shots with no fantasy elements will sometimes unpredictably tend towards low-realism
   - Z-Image Base, on the other hand, can generate photo-real pictures of unrealistic concepts very easily and will consistently output the most realistic images of any model (except maybe Ideogram, but I haven't played with that yet)
   - Krea 2 can be just as realistic as Z-Image Base, it's just harder to prompt for it
3. Krea 2 leaves a subtle halftone grid pattern over every image (because of the Qwen VAE)
   - It's not a big problem if you're doing high res gens, but it is annoying and Z-Image Base doesn't do it in the first place so it has the advantage there
4. Krea 2 sucks at hair and small patterns/particles (because of the Qwen VAE again)
   - Z-Image, by comparison, is great at hair and has no issues with small patterns/particles
   - There's info on why this happens in the Qwen VAE section above
5. Krea 2 tends to make "pretty" women even when not asked to, which can be very annoying
   - This can be fixed with loras and finetunes in the future
   - Z-Image Base, on the other hand, will generally make very realistic and casual people unless you ask it not to (or it's contextually suggested)
6. Krea 2 is more prompt adherent and can do more flexible things in general
   - Except when you're asking for something that got ruined by the censorship, in which case Krea 2 is worse at following instructions than a toddler
   - And except where point #2 about realism is concerned, but again that's fixable with loras
7. Krea 2 has a much better understanding of anatomy and body shapes, even for SFW prompts
8. Krea 2 is generally better at animals & animal fur (best I've seen from any model)
9. Krea 2 is less prone to random mistakes
10. Krea 2 is much more reliable when generating images with wide aspect ratios, like 16:9
11. Krea 2 generates images about 8x faster, which is huge
12. Krea 2 is much easier to train loras on
    - I don't have any insight into this, I'm just repeating what the lora training folks are all saying
    - For people doing gens, this means you'll get access to more loras faster and they'll generally be better too

Verdict?

Krea 2 is better than Z-Image Base at many things. There are some areas where Z-Image Base is the better choice, such as facial expressiveness, hair, generally realistic skin, and ease of making photo-real images - but keep in mind it's a lot slower to gen with than Krea 2 is.

It's pretty obvious that Krea 2 is going to become the next SDXL thanks to its creativity and ease of training.

What about Krea 2 vs Z-Image Turbo?

idk I don't really use it, but probably the same list of advantages/disadvantages except Z-image turbo isn't as good at realism as Z-image base is.

So, how about the Qwen VAE issues...

With Krea 2, the two Qwen VAE issues above (the halftone grid, and the trouble with hair and small patterns/particles) can be dealbreakers depending on what you're doing. If you do really need to solve them, I suggest generating the image in Krea 2 and then doing small inpainting refinements with Z-image base/turbo on the problematic areas.

For example, you might generate an image of a person in Krea 2 and then do a 0.2 denoise refinement on just the hair of that person using Z-image base/turbo. This is of course only necessary if the hair is bothering you.

Resolutions & Aspect Ratios?

Krea 2 is a banger and can do high resolutions no problem, just like Z-image. I've left a bunch of common ones in the workflow, but you can probably go even higher - I just haven't tested that.

Unlike some models - even Z-image - Krea 2 is VERY capable of doing wide images, so don't be afraid of cinematic aspect ratios. It has a much higher success rate with anatomy and general correctness than I've seen with other models.

This means Krea 2 can make things like wide-screen desktop wallpapers very easily.
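If you want a size that isn't in the workflow's presets, a small helper like the one below can work out dimensions for a given aspect ratio. The 2-megapixel budget and the snap-to-16 rule are assumptions for illustration, not tested limits of Krea 2; the presets in the workflow remain the tested reference.

```python
import math


def dims_for_aspect(aspect_w, aspect_h, megapixels=2.0, multiple=16):
    """Return (width, height) near the pixel budget for an aspect ratio.

    Both sides are rounded to the nearest multiple of `multiple`
    (an assumed constraint, chosen because it is a common safe value).
    """
    target_pixels = megapixels * 1_000_000
    height = math.sqrt(target_pixels * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h

    def snap(value):
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    print("16:9 ->", dims_for_aspect(16, 9))
    print("21:9 ->", dims_for_aspect(21, 9))
```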

CFG?

If you're using RAW with the turbo lora, you can use CFG > 1. I've tested it with CFG = 2 and it turns out fine.

But do note that using CFG > 1 will double your generation time.
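The doubling comes from how classifier-free guidance works: each step needs both a prompted (conditional) and an unprompted/negative (unconditional) model pass, which are then blended. A minimal sketch with toy numbers, using the standard CFG formula; the function names and values are made up for illustration.

```python
def cfg_combine(cond, uncond, cfg):
    """Standard classifier-free guidance: uncond + cfg * (cond - uncond)."""
    return [u + cfg * (c - u) for c, u in zip(cond, uncond)]


def model_passes_per_step(cfg):
    """At cfg == 1 the result equals the conditional prediction,
    so the unconditional pass can be skipped entirely."""
    return 1 if cfg == 1.0 else 2


if __name__ == "__main__":
    cond, uncond = [1.0, 2.0], [0.5, 1.0]  # toy model predictions
    print("cfg 1:", cfg_combine(cond, uncond, 1.0), model_passes_per_step(1.0), "pass")
    print("cfg 2:", cfg_combine(cond, uncond, 2.0), model_passes_per_step(2.0), "passes")
```

At CFG = 1 the blend collapses to the conditional prediction, so only one pass is needed; anything above 1 needs both, hence roughly double the gen time.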

Sexy Loras?

If you're using unsafe-for-work loras at low strength, you should still leave the filter bypass lora on. It'll help. But if your unsafe-for-work loras are high strength then you can skip the bypass lora, it won't be doing much and might even interfere.

You can see lewd images on this post if you're on civitai red, and I've put the lora strength information in the prompt descriptions above the actual prompts. I used exactly the same loras & strengths for every picture here. I use this specific combination because it doesn't reduce Krea 2's creativity & aesthetics; you're welcome to use whatever loras you want. Just be aware that many of them change Krea 2's aesthetics, especially at high strength.

Most lewd loras don't need high strength to work well as long as you have the filter bypass on. Krea 2 already knows what anatomy looks like, it's just hidden behind the censorship filter.

You can find me as u/nsfwVariant on reddit to see some of the degenerate info I've posted, or you can go to the r/degenDiffusion subreddit to see tips / ask questions about such things.