TL;DR
As of 2026/01/22, this article is a work in progress and will be updated over the next few days. It is published early mainly so that I can have a working link to it in the image posts where the actual workflow files are.
This article collects links to my set of ComfyUI workflows that I use for generating and working with AI art. In this article, I also share some of my observations and usage tips for each model family.
The workflows are embedded in images published in a series of image posts on my channel, and linked from here. There is one post per model family.
The images are kept PG so that the content is accessible to everyone.
How to use: drop the relevant image into your ComfyUI to load the workflow. Then, in ComfyUI, click the hamburger menu ("≡") at the top left corner of the user interface (or the one in the header of the active Comfy tab), and select Save As to save the workflow into your workflows.
NOTE 2026/01/26: The layouts of these workflows were designed for ComfyUI's legacy node system, and have unintended visual overlaps when the Nodes 2.0 option in the GUI is enabled. I might release another version of the workflows for Nodes 2.0 later. When I tried Nodes 2.0, the GUI looked nicer than before, but previews consistently stopped working after a few renders, so I'll give the ComfyUI authors a bit of time before trying it again.
Introduction
As I'm writing this, it's January 2026, and it's been a hectic half a year since summer 2025. New image-generator AI models have appeared at a breakneck pace, and my time available for hobbies has not been able to keep up. I've been diving into new model families and getting a feel for what works and what doesn't, but not publishing many images. During this time, I have also switched to ComfyUI as my main visual GenAI tool.
ComfyUI is... in short, the visual GenAI community's favorite flight simulator. While those of us into actual aviation simulation may beg to differ, a typical Comfy workflow has so many knobs and buttons that I keep looking for the flap lever and the pitch trim, idly wondering what the ICAO code for today's destination in latent space is.
Seriously though, as ComfyUI is a construction kit rather than a simple tool where you just enter a prompt and press Generate, there has been a lot of workflow building happening behind the scenes. Since I like to run all my image-generator AIs inside one environment, I've been migrating my working habits from SD Forge to ComfyUI. This has meant building txt2img, img2img and inpaint workflows for each model family. While ComfyUI does provide some default workflows, there's always something I want to do differently - e.g. using GGUF quants, or having a zoomed-in inpainter like the "Masked only" mode in SD Forge.
So I'm introducing Mathemagic's ComfyUI Workflows, as my attempt to make things as simple as reasonably possible, but no simpler.
This workflow package has certain focus areas:
Txt2img, img2img and inpaint workflows for each model (where it makes sense), like the three main modes of SD Forge.
Optimized for interactive work (as opposed to batches).
KSampler with live preview, so if the composition turns out botched, you can see it immediately, and cancel without waiting for the render to finish.
Obsessive focus on fast renders with decent quality, rather than 5% more quality at the cost of 4× more render time. Generally, as few steps as I can get away with (DPM++ 2M with SGM Uniform helps a lot here), and for Qwen models, loading the Lightning accelerator LoRA.
Autosaving for each finished render, with the filename containing an ISO timestamp (yyyymmddThhmmss, in local time), RNG seed, model name, sampler name, steps, and CFG value (see the sketch after these lists).
Simple, clear, visually readable node graph layouts.
Nodes positioned to make it easy to see what connects where.
Links visible.
No unnecessary nodes.
A small amount of extra complexity is allowed to support essential features:
Zoomed-in inpainting (crop-and-stitch), similar to SD Forge.
LoRA. This is mainly to document at which point to insert LoRAs and how.
GGUF loader for anything larger than Illustrious-XL, because VRAM is always a limitation, and GGUF quants (especially the new Unsloth dynamic ones) work well in practice.
In img2img and inpaint workflows, automatic adjustment of the number of steps based on the chosen denoise level, like SD Forge does (also shown in the sketch below).
Extensive comments as Markdown note nodes, to make these workflows usable also as tutorials.
A part of this article is actually in those nodes inside the individual workflow files.
As few 3rd party node packages as reasonably possible.
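To make two of the bullets above concrete, here's a minimal Python sketch. The function names, the filename field order, and the exact rounding are my illustrative assumptions, not lifted from the workflow files:

```python
from datetime import datetime

def autosave_name(seed: int, model: str, sampler: str, steps: int, cfg: float) -> str:
    """Illustrative autosave filename: ISO local timestamp plus render metadata."""
    ts = datetime.now().strftime("%Y%m%dT%H%M%S")  # yyyymmddThhmmss, local time
    return f"{ts}_{seed}_{model}_{sampler}_{steps}steps_cfg{cfg}.png"

def scaled_steps(base_steps: int, denoise: float) -> int:
    """SD-Forge-style step scaling: a lower denoise traverses only part of the
    noise schedule, so proportionally fewer steps are needed."""
    return max(1, round(base_steps * denoise))

print(autosave_name(123456789, "qwen-image-2512", "dpmpp_2m", 8, 1.0))
print(scaled_steps(20, 0.55))  # -> 11
```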
I'm publishing the workflows embedded in image files. For models with text rendering capability, the text on the sign held by the character says which workflow is embedded in the image. Models sometimes get the text right, and sometimes produce something funny that nevertheless (mostly?) gets the message across.
In the case of Illustrious-XL, which is too small and old to generate text, one just has to guess which is which - or, more easily, look at the filenames when downloading the images.
I'm including workflows for Illustrious-XL in my set, because even though its SDXL architecture is already showing its age, Illustrious-XL is still unmatched in its sublime digital painting anime style, while also supporting the largest variety of niche topics seen in any anime model so far. In some aspects of character specification (such as hairstyles, and breast size for females), as of 01/2026, Illustrious-XL still beats newer models.
It could be said that in the past year or so, the state of the art, and how to apply it, has changed. Unlike in the SD 1.5, SDXL, PonyXL and Illustrious-XL days, when the latest model was best at everything, now different models are best at different things. It is increasingly useful to keep a copy of your old (but still serviceable) models around, in case they do some things better than recent releases.
What this of course means for the visual GenAI artist is that now there is more need than ever to jump between different checkpoints - and even model families, with wildly different prompting styles - while creating a single still-image artwork. The world would be a simpler place if there was one generalist model to rule them all, but it would have to cover every obscure anime trope (including many for R18 audiences) before there is the slightest chance of superseding the capabilities of the collection of models we already have.
So, raising a toast to variety, and without further ado, let's get into the...
Workflows
The images with the embedded workflows are in the following series of image posts, because articles do not support inline PNG images.
There is one image post per model family:
Qwen, which includes both Qwen-image and Qwen-Image-Edit
TBA - Tools: pose detector
TBA - Tools: SD prompt reader
The individual workflows are linked below, in each section.
When using these, to set values such as denoise exactly, click the field, enter a number on the keyboard, and press Enter. If you set e.g. 0.55 denoise, ComfyUI will actually use that value, even if the GUI field shows 0.6 (the actual value rounded to one decimal place), and even though clicking the right/left arrow buttons changes the value by a larger step.
As the license for all of the workflows, I choose WTFPL - feel free to do whatever you want with these workflows, including publishing modified versions.
Qwen-Image 2512
As of 01/2026, the Qwen family of models is the go-to choice for complex prompt adherence, if you have the hardware to run a 20B model without losing your sanity. The model is developed by Alibaba, the same company behind the Qwen series of LLMs, which have done well as locally hostable thinking models. The model's official GitHub page is here.
The outputs suffer way less from AI greeble than earlier models, but since the principles behind the technology have not changed, the model still does not have an actual understanding of global geometry. If you generate character art, you will still get the occasional fence or shelf whose ends do not meet behind the character. And you will still occasionally get the wrong number of fingers, if you render anything except basic portraits.
(Note that there is also Qwen-Image-Layered, which is a different model that tackles the global geometry issue like a human digital artist would - by drawing the image as a stack of separate layers. I have not tested it myself, but ComfyUI has a default workflow available.)
Qwen's text rendering is among the best, at least among open models. English and Chinese are supported (no Japanese, and no European languages with umlauts). The model will still misspell occasionally, and some words (such as "img2img" or "inpaint") it fails to spell correctly at all. Inpainting can be useful for fixing words that the model can spell, but doesn't get right 100% of the time.
Workflows are here. Direct links:
These workflows are also compatible with the original Qwen-Image (summer 2025, no version number).
Qwen-Image-Edit 2511
Qwen's sister with leet skills in photoshopping. Main open competitor to Google's proprietary Nano Banana.
Edit is especially useful when you already have a character illustration (e.g. rendered by another model), and want to render that character in different clothing, in a different pose, in a different environment, or doing a different activity. It can also generate front / side / back views, which is convenient for compiling character sheets. Since this cranks character consistency up to eleven, it should also be possible (though I haven't tried) to draw comic strips with GenAI, at least one panel at a time.
Furthermore, multiple image inputs (supported since version 2509) allow combining several given characters in the same image, and changing a character's pose based on a DWPose pose image (those colorful stick figures traditionally used with ControlNet).
Multiple inputs also allow transferring given clothing to a given character. The input can be an image of another character wearing that clothing, or a bare clothing image. You can also use Edit to generate a new character wearing given specific clothing. Open GenAI dress-up has arrived!
Workflows are here. Direct links:
Basic edit, with up to 3 input images
Inpaint edit, with a main image to edit and an optional extra reference image for the edit (can be trivially extended to 2 references, I just haven't needed that)
These workflows are also compatible with the older 2509. I'm mentioning this because some people's mileage with 2511 varies.
EDIT 2026/01/26: My color-burning issues with Qwen-Image-Edit-2511 (see the image post with the workflows) were a ComfyUI version issue. Commit 56fa7dbe380cb5591c5542f8aa51ce2fc26beedf from 7 December 2025 had the issue, but commit 7ee77ff038937bdfdbea5d603ad8d4c487c14fd6 from 25 January 2026 works fine.
Z-Image Turbo
As of 01/2026, Z-Image Turbo is another recent model that shows promise. This is 6B, vs. Qwen's 20B, so the model runs faster and uses less VRAM. The model itself is step-distilled (like Flux.1 Schnell), so it doesn't need an accelerator LoRA. It supports 8-step rendering out of the box.
This is another model release by Alibaba, but apparently by a different team. The model's official GitHub page is here. The authors have hinted that base (non-distilled) and edit versions may be upcoming, but as of this writing, those have not been released yet.
In my tests, Z-Image Turbo seems over-focused on portraits. It can be difficult to get a full body shot of a character even when you prompt for it. Mentioning "feet" in the prompt often doesn't help: the model prefers to change the view direction (e.g. looking down to show the feet), or to contort the character's pose (e.g. feet up, knees bent, when sitting on a chair), rather than moving the camera further away.
But when you can get the model to do what you want, the output looks nice, and prompt adherence is almost as good as Qwen's. Text rendering is also good, but specific words may fail. In my tests the model invariably spelled "turbo" as "tubro".
For photorealistic character renders, the word on the street is that Z-Image Turbo generates skin detail better than Qwen.
Also, I'll take the opportunity to point out, regarding one of my OCs, Liz the nerdy university student (who has probably become the unofficial mascot of my channel): out of all imaginable capabilities that an image GenAI could have, Z-Image Turbo is excellent at drawing dental braces, especially with the inpainting workflow. It's also decent at drawing nerdy glasses that are semi-opaque rather than fully opaque. So if you need such visual tropes for your nerdy OCs, this model can be useful as an inpainter.
Workflows are here. Direct links:
Chroma1-HD
Chroma is essentially a pruned-down (8.9B, down from 12B) and fine-tuned Flux.1 Schnell that attempts to undo the step distillation so that it can use CFG higher than 1. Thus, the negative prompt is available. According to the model author, besides this, the main point of the project was to create a Schnell-based checkpoint that's fine-tunable for further training.
The model is rather creative, in the sense that the same prompt can produce many different kinds of images by varying the seed, like SD 1.5 and SDXL. However, like those early models, Chroma will also generate copious amounts of slop. Depending on your use case, you may need to fish for a decent RNG seed for a while.
The word on the street is that Chroma works well for rendering photorealistic images. Its popularity, as well as the availability of a negative prompt in something more advanced than SDXL, piqued my interest, so I tested Chroma for generating illustration images. I found its capabilities to be hit or miss. Especially if you ask for both anime and scifi in the same prompt, the model seems to know only one style - a full-color pre-production sketch from a "making of" artbook, or perhaps an illustration suitable for a tabletop RPG manual. On top of that, illustrations often turn out like mediocre fanart, no matter which quality tags are in the prompt.
As its prompt format - at least when used for illustrations - Chroma accepts a hodgepodge of natural language and comma-separated booru tags. For example, you can write a paragraph of natural language, then a paragraph of booru tags, and then switch back to natural language for another paragraph.
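For illustration, a mixed-format prompt might look like this (a made-up example for this article, not taken from any of my images):

```
A nerdy university student stands in a cluttered dorm room,
holding up a cardboard sign that reads "txt2img".

1girl, solo, glasses, twintails, smile, indoors,

The lighting is warm, coming from a single desk lamp.
```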
Chroma being a Flux.1 fine-tune, it has some text rendering capability, but newer models such as Qwen and Z-Image usually do better here. But curiously, when making these workflow images, Chroma was the only model that could spell all of "txt2img", "img2img", "inpaint", and its own name.
Since Chroma is not that great at rendering illustrations, I haven't used the model much, other than to note (after extensive testing) that it's not suitable for my use cases.
Compared to Lightning-accelerated newer models, it's also slow. I like having a negative prompt, but my patience has its limits, lol.
I'm providing the workflows anyway, as they may be useful for other use cases. The txt2img one could be replaced with ComfyUI's default workflow, but a zoomed-in inpainter isn't readily available.
Workflows are here. Direct links:
Illustrious-XL
Old but still serviceable model family from 2024, based on the SDXL architecture. The model is only 3B, so it runs fine at FP16 even with just 8GB of VRAM, completing a 20-step render in a very short time. This model is small enough not to need an accelerator or step distillation.
Illustrious excels at rendering digital painting anime style characters. Unlike many models that associate the word "anime" with flat colors and spiky hair of the 1980s and 1990s, the Illustrious style looks like modern 2D CG anime made in the 2000s or later.
Unlike newer models that essentially use the input side of an LLM as the text encoder, the SDXL architecture uses classical CLIP models. It supports prompt weighting, which both SD Forge and ComfyUI expose with the syntax "(some important term:1.2)". Usually a good range for the weighting is 0.8 ... 1.2, but I've sometimes gone up to 1.6.
Anime SDXL models used to require clip_skip = 2 (empirically, the second-last CLIP layer had embeddings that yielded the best prompt adherence for those models), but with Illustrious I haven't bothered with that, and it works fine.
In SD Forge, FreeU was a nice technology for improving SDXL outputs, and SD Forge Couple gave the ability to target different regions with different prompts, allowing multiple OCs in the same image without prompt leakage between them. I haven't explored whether anything similar to these two technologies is available for ComfyUI. Nowadays, I mostly render with newer models, where default quality is fine, and prompt leakage is (at least almost) a solved problem.
Illustrious remains particularly useful for creating characters to use as inputs for Qwen-Image-Edit, especially with character designs that Qwen simply doesn't understand. These include at least women with very short hair (a pixie cut) and/or small breasts (when "anime" is also mentioned in the prompt). But if the character needs a shirt with text, then you'll additionally need a newer model (Qwen, Z-Image, or maybe Flux) to separately add the text in inpainting.
The Illustrious workflows do not use an accelerated or step-distilled model. Thus, the negative prompt is available. This is especially great for concepts that don't have a positive tag. An example: I may want "twintails" but not "low twintails", and a "high twintails" tag does not exist - so the negative prompt is the only way to say it. I would also love to be able, in newer models, to specify for some OCs that the hairstyle should have "bangs", but not "sidelocks". No such luck - but in Illustrious, the negative prompt allows you to do exactly that.
Unlike some newer models here, which prefer natural language prompts, Illustrious is an old-school anime model whose native prompt format is a comma-separated list of booru tags. You can try natural language (at least for describing things that don't have a tag), but you may get better prompt adherence with a list of tags where possible.
Also, Illustrious gives you better-quality gens if you end the positive prompt with "newest, masterpiece, best quality", and the negative prompt with "sketch, monochrome, oldest, worst quality". If your particular checkpoint recommends something else, use that instead.
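Putting the above together - booru tags, weighting, quality tags, and the negative-prompt trick - an illustrative prompt pair might look like this (made up for this article, not taken from any specific workflow):

```
Positive:
1girl, solo, twintails, (dental braces:1.2), glasses, smile,
classroom, sitting, upper body,
newest, masterpiece, best quality

Negative:
low twintails, sketch, monochrome, oldest, worst quality
```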
The model is small by 2026 standards, so one should not expect too much out of it in terms of prompt adherence for very complex prompts. But what it can do, it does well.
If you haven't used older models, you'll find a lot of example prompts on this site by searching for Illustrious.
Workflows are here. Direct links:
These workflows are also compatible with any model that uses the SDXL architecture, including base SDXL finetunes as well as PonyXL models. But note that if you use LoRAs, they are specific to each of the three types (IL, Pony, XL).
Tools
Background remover
This is useful for extracting a character from an image that has a background.
InSPyReNet is a highly accurate, fully automatic background remover. At least in my experience, it often produces more accurate results than e.g. isnet or u2net (which are offered by, among others, the rembg extension for SD Forge).
This is a really simple workflow, using this ComfyUI node. It is published mainly for completeness, as well as to raise awareness of this excellent neural background remover and of the ComfyUI node that allows using it in Comfy. I'm not affiliated with either of these.
The workflow is here.
Example input image, created with Qwen-Image-2512, with inpainting.
Foreground mask. This image contains the workflow.
Result. RGBA image, with the mask in the alpha channel. This image also contains the workflow.
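If you'd rather apply the mask outside ComfyUI, the same combination step is a few lines of PIL. A minimal sketch, with placeholder filenames:

```python
from PIL import Image

img = Image.open("input.png").convert("RGB")   # original image with background
mask = Image.open("mask.png").convert("L")     # InSPyReNet foreground mask

rgba = img.convert("RGBA")
rgba.putalpha(mask)                            # mask goes into the alpha channel
rgba.save("result.png")
```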
How to manually polish the mask
Most often, the model Just Works. But occasionally, you may need to open the image and mask in a photo editor (such as GIMP or Photoshop), and tweak the mask manually.
How to do this in GIMP:
Open the original image (with character and background).
Add a layer mask to the image. (Right-click the layer in the Layers panel to do this.)
Open the mask image produced by InSPyReNet. Select all, copy.
Go back to your original image. Click on the layer mask in the Layers panel to tell GIMP you'll be drawing to the mask (not to the image itself).
Paste. Select none (Ctrl+Shift+A). Now the layer mask should be your mask image.
Edit the mask image with your leet 'shopping skills.
Right-click the layer in the Layers panel, and convert the layer mask to the alpha channel.
Export PNG.
Pose detector
TBA
This is especially useful with Qwen-Image-Edit, which accepts these pose images natively as image inputs.
https://github.com/Fannovel16/comfyui_controlnet_aux
SD prompt reader
TBA
This is really just a minimal example, to raise awareness of this node package for those users who don't yet know about it.
There's a standalone app called SD Prompt Reader, which can dig out metadata from images generated by SD Forge. The SD prompt reader node package brings this functionality into ComfyUI.
The package also includes a Prompt Saver node, which saves a copy of your metadata in A1111 format (into the image file) so that CivitAI autodetects it when you upload the image. I haven't used this in my workflows, though.
https://github.com/receyuki/comfyui-prompt-reader-node
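For reference, this kind of metadata lives in PNG text chunks, and you can also inspect it with plain PIL. A minimal sketch - "workflow" and "prompt" are the chunk names ComfyUI uses, "parameters" is the A1111/SD Forge one, and the filename is a placeholder:

```python
import json
from PIL import Image

img = Image.open("some_render.png")
chunks = getattr(img, "text", {})  # PNG text chunks, if any

if "workflow" in chunks:           # ComfyUI: node graph as JSON
    graph = json.loads(chunks["workflow"])
    print(f"ComfyUI workflow with {len(graph.get('nodes', []))} nodes")
if "parameters" in chunks:         # A1111 / SD Forge: plain-text metadata
    print(chunks["parameters"])
```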
Meta
Not the company. In the original sense: about this article.
Which models are not covered (yet)
Qwen-Image-Layered.
As mentioned in the Qwen-Image section.
This is to my knowledge the first model that is ~guaranteed to produce sensible background geometry, every time.
Could supersede Qwen-Image (the non-edit variant) if the output quality is similar.
OTOH, Alibaba seems to go for a breadth-first approach to developing AI models, so this could be a one-off, with no guarantees of ever being developed further. Qwen-Image and especially Qwen-Image-Edit seem to be the mainline image models that get the majority of the lab's development resources.
WAN.
Flux.
I think that mainly, Qwen and Z-Image have eaten Flux.1's lunch by now. Which is fair enough - Flux.1 was among the first models that went beyond SDXL quality.
Flux.2 is an option, but unlike the previous two mentioned above, there is no clear use case where it obviously wins over what I already have.
Criteria for choosing models
Aspects that I evaluate models for:
Capabilities, relative to other models I already have.
Concepts to build OCs and scenes with. E.g. does it know what a grand piano looks like? What the interior of a scifi starship could look like? What twintails are? Can the model use a basic hairstyle as a template, and customize it if I prompt for different details?
Quality and style of illustration output. Anime, western cartoon, comics, ...
Ability to combine unrelated concepts. E.g. a retired lighthouse being launched as a rocket.
Ability to render complex scenes with a single prompt. E.g. a dimensional portal to another world, embedded into a scene that may be from a different genre (e.g. cyberpunk vs. high fantasy).
Ability to render multiple characters with a single prompt (for multiple OCs in the same image).
Object interaction, e.g. characters holding items or doing various activities.
Text rendering, e.g. for clothing or signs.
ROI: time investment in the new model as a user, versus the new capabilities it gives.
With each model, one has to learn model-specific, non-transferable skills: how to prompt it effectively, what it does well, and what it struggles with.
New models, with new capabilities, appear so fast that it only makes sense to invest time in such skills if they are likely to remain relevant six months from now.
Hence I prioritize major model families that are likely to stay in the game.
LoRA ecosystem, both in general, and for niche R18 topics.
I don't have the time to fine-tune models myself. Exploring inference to get the models to do what I want is already a major rabbit hole.
Even with capability improvements, LoRAs are still needed, and will likely always be needed. There are always specific concepts and styles that any given generalist model won't be able to render out of the box.
Misalignment between corporate/lab interests and user interests, but also the simple fact that the space of all visual ideas is large.
An example is how different models treat the keyword "anime". For Z-Image it means 1980s/1990s, while for Illustrious, 2000s - and no amount of prompting can make those models draw in each other's styles.
Model capabilities I don't evaluate for:
Ability to render existing characters from various IPs. For me, this capability is not directly useful, except for random Anna/Elsa memes.
Video rendering. Too slow for interactive work.
Photorealistic output. Not my style.
Full list of ComfyUI node packages used by these workflows
Be careful when installing ComfyUI nodes from the internet. ComfyUI is so popular that malware authors treat it as an exciting new attack vector.
Here and in the embedded workflows, each node package link goes to the node author's original GitHub repository.
For the main workflows, there are currently five dependencies in total:
GGUF loader: https://github.com/city96/ComfyUI-GGUF
Alternative GGUF node package, including a GGUF VAE loader: https://github.com/calcuis/gguf
Aspect ratio selector: https://github.com/budihartono/comfyui-aspect-ratio-presets
KSampler with live preview: https://github.com/jags111/efficiency-nodes-comfyui
Zoomed-in inpainting: https://github.com/lquesada/ComfyUI-Inpaint-CropAndStitch
The tools depend on one node package each:
Background remover: https://github.com/john-mnz/ComfyUI-Inspyrenet-Rembg
Pose detector: https://github.com/Fannovel16/comfyui_controlnet_aux
SD prompt reader: https://github.com/receyuki/comfyui-prompt-reader-node
About the attachment
The attached file presets.py is ComfyUI/custom_nodes/comfyui-aspect-ratio-presets/presets.py, to set up the available ARs and sizes for the aspect ratio / image size selector node.
Back up your original file first in case you want to restore it, then paste this one over it.
Or look through this one, and copy the presets you deem useful into yours.
Although it's technically a Python module, it's really just a plain-text configuration file.


