LLM-Enhanced Video Workflow

Turn a few words into polished images and videos

LLM-driven, multi-stage diffusion made simple

Imagine generating fully polished images and videos from just a few words, without spending hours refining prompts.

This workflow combines:

  • LLM-based story generation

  • Structured diffusion sampling

  • Multi-stage iterative upscaling

  • WAN 2.x video generation

The result is a system where minimal input becomes structured, coherent, and visually strong output, while keeping anatomy stable and artifacts under control.

This workflow is intentionally not visually "pretty."
It is structured for readable control flow rather than aesthetic node alignment.


Core Concept

If you enter a very short prompt, for example:

1girl

the LLM will generate a structured story about the subject. It will invent details such as her appearance, clothing, expression, pose, environment, lighting conditions, mood, and contextual elements. The shorter your input, the more creative freedom the LLM has. The more specific your input, the more the generated story reflects your exact intentions.

Your original prompt is never replaced. It remains part of the final positive prompt. The LLM output is appended and merged with it before sampling. This means you do not lose information by using the LLM layer.

Scaling modifiers such as (happy:1.2) still work exactly as expected. They are passed through to the sampler and influence weighting normally.

There are two separate input fields for both image and video generation. The description field is visible to the LLM. The keywords field is not. The keywords field is appended after the LLM output and is ideal for art styles, LoRA trigger words, or technical modifiers that you do not want the LLM to reinterpret.

This separation allows clean structural generation while keeping stylistic control precise.
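The merge order described above can be sketched as simple string assembly. The function and field values below are illustrative, not the workflow's actual node names:

```python
def build_positive_prompt(user_prompt: str, llm_story: str, keywords: str) -> str:
    """Assemble the final positive prompt: the original input is kept,
    the LLM story is appended after it, and the keywords field (which the
    LLM never sees) goes last so the model cannot reinterpret it."""
    parts = [p.strip() for p in (user_prompt, llm_story, keywords) if p.strip()]
    return ", ".join(parts)

final = build_positive_prompt(
    "1girl, (happy:1.2)",  # original prompt; weight syntax passes through untouched
    "a young woman in a sunlit meadow, soft golden light",  # LLM-generated story
    "watercolor style, myLoraTrigger",  # keywords field (hypothetical values)
)
```

Because the original prompt always comes first and the keywords always come last, neither is lost or rewritten by the LLM layer.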


System Requirements and Setup

This workflow was tested on a Ryzen 7800X3D, 64GB RAM, and an RTX 4090. For WAN video generation, 24GB VRAM is strongly recommended.

Install Ollama first. Then open a terminal and run:

ollama run mistral-small3.2

This installs the mistral-small3.2 model. Keep Ollama running in the background while using the workflow.
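If you want to verify that Ollama is reachable before launching ComfyUI, the story generation is ultimately an ordinary HTTP POST to Ollama's local API. The payload below follows Ollama's documented `/api/generate` schema; the prompt text is just an example:

```python
import json

# Payload for Ollama's /api/generate endpoint (default port 11434).
# "stream": False requests one complete JSON response instead of chunks.
payload = json.dumps({
    "model": "mistral-small3.2",
    "prompt": "Expand this into a short visual story: 1girl",
    "stream": False,
})

# To send it while Ollama is running:
#   curl http://localhost:11434/api/generate -d "$PAYLOAD"
print(payload)
```

If this request returns a response, the workflow's LLM stage will work as well.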

After loading the workflow in ComfyUI, open ComfyUI Manager and click Install Custom Nodes. Missing nodes will break parts of the pipeline.



WAN 2.2 Dependencies

For WAN video generation to work correctly, the following files are required:

If any of these are missing or incorrectly connected, WAN video generation will fail or produce broken output.


Important โ€“ WAN 2.2 Model Connection

Currently, the workflow is configured with UNET loaders (GGUF) connected.

If you are using a safetensors WAN 2.2 model, you must:

  • Disconnect the GGUF UNET loader

  • Connect the Load Diffusion Model nodes directly to the Video Lora nodes

Otherwise, the video pipeline will not work correctly.

This only applies if you switch to a safetensors version of WAN 2.2.


How to Use the Workflow

For image generation, enter a short or detailed description in the Image Description field. This is what the LLM sees. In the Image Keywords field, add LoRA triggers, art styles, or special tokens that you want appended after the LLM output.

You do not need to add quality modifiers or negative prompts. Those are already included inside the nested workflow components.

The NSFW toggle controls output type. Set it to 0 for safe content or 1 for NSFW output.

For video generation, the same logic applies. The Video Description is processed by the LLM. The Video Keywords field is appended afterward for stylistic control.

The Megapixels for Video parameter controls the resolution of the generated video. A value of 0.66 works reliably and provides a good balance between quality and performance. Higher values increase VRAM usage.

For video length, 5 to 6 seconds tends to produce stable results. At 7 seconds or more, the model may start introducing looping artifacts or scene repetition.



Image Generation and Upscaling Pipeline

The upscaling pipeline evolved over time. The goal was to improve detail while preserving anatomy and avoiding distortion, especially in hands and faces.

The process begins with selecting an aspect ratio. An empty latent at 1 megapixel is created and padded to 32-pixel alignment. Alignment to 32 pixels is important because non-aligned resolutions can introduce border artifacts.
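The arithmetic behind the 1-megapixel, 32-aligned latent can be sketched like this (a minimal approximation of the resolution math, not the actual node):

```python
import math

def latent_size(aspect_w: int, aspect_h: int, megapixels: float = 1.0, align: int = 32):
    """Pick a width/height matching the aspect ratio, close to the target
    pixel count, with both sides snapped to a multiple of `align` to
    avoid border artifacts."""
    target = megapixels * 1_000_000
    w = math.sqrt(target * aspect_w / aspect_h)
    h = w * aspect_h / aspect_w
    snap = lambda v: max(align, round(v / align) * align)
    return snap(w), snap(h)

print(latent_size(16, 9))  # e.g. a 16:9 base resolution near 1 MP
```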

The base image is generated using 69 steps with Euler Ancestral and the Normal scheduler. Euler Ancestral produces strong structural foundations and dynamic compositions, which makes it well suited for the initial generation stage.

The first upscale increases resolution by a factor of 1.4 using two iterative steps at 24 sampling steps each, with Euler and the Linear_Quadratic scheduler. Denoise is set to 0.21. This stage allows moderate refinement while the image is still small enough to prevent large scale distortions.

Next comes face and hand refinement using DPM++ 2M with the Simple scheduler, 16 steps, and a denoise value of 0.21. DPM++ 2M is particularly good at preserving structure while improving micro detail. It helps stabilize anatomy and correct small irregularities without shifting the composition significantly.

After that, a second controlled upscale increases size by another factor of 1.2 in two steps. Because the image is now significantly larger than typical training resolutions, this stage must be conservative. DPM++ 2M with Simple scheduler is used again, with 16 steps and denoise set to 0.2. The goal here is gentle refinement without introducing high resolution artifacts.

The final smoothing pass uses DDIM with the Simple scheduler, 12 steps, and denoise at 0.18. DDIM is stable and predictable, making it ideal for subtle finishing touches.

Upscaling tends to slightly desaturate images. To compensate, the final step performs color matching against the original base image to restore vibrancy and contrast.
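The color-matching idea can be approximated with per-channel mean/std transfer back to the base image's statistics. This is a common, generic technique; the workflow's actual color-match node may use a different method:

```python
import numpy as np

def match_color(image: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Shift each channel of `image` so its mean and standard deviation
    match `reference`, restoring saturation and contrast that iterative
    upscaling tends to wash out."""
    out = image.astype(np.float64).copy()
    ref = reference.astype(np.float64)
    for c in range(out.shape[-1]):
        src_mu, src_sd = out[..., c].mean(), out[..., c].std()
        ref_mu, ref_sd = ref[..., c].mean(), ref[..., c].std()
        if src_sd > 1e-8:  # avoid dividing by zero on flat channels
            out[..., c] = (out[..., c] - src_mu) * (ref_sd / src_sd) + ref_mu
    return np.clip(out, 0, 255).astype(np.uint8)
```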

Through experimentation, the most reliable pattern has been multiple small controlled upscales rather than one aggressive jump in resolution.
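For reference, the full schedule above condensed into data. The stage names are my own labels; the sampler and scheduler identifiers follow ComfyUI's naming conventions, which is an assumption about the exact node settings:

```python
# (stage, sampler, scheduler, steps, denoise, upscale factor per stage)
STAGES = [
    ("base",       "euler_ancestral", "normal",           69, 1.00, 1.0),
    ("upscale_1",  "euler",           "linear_quadratic", 24, 0.21, 1.4),  # 2 iterations
    ("face_hands", "dpmpp_2m",        "simple",           16, 0.21, 1.0),
    ("upscale_2",  "dpmpp_2m",        "simple",           16, 0.20, 1.2),  # 2 iterations
    ("smoothing",  "ddim",            "simple",           12, 0.18, 1.0),
]

total_scale = 1.0
for _, _, _, _, _, scale in STAGES:
    total_scale *= scale
print(f"total upscale factor: {total_scale:.2f}")
```

Note that every refinement stage keeps denoise at or below 0.21, which is the pattern that keeps anatomy stable at high resolutions.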


Sampler and Scheduler Insights

Euler Ancestral is excellent for the first pass because it encourages variation and strong structural emergence. It can introduce creative diversity while still forming a coherent base.

Euler without ancestral noise works better for controlled refinement. It reduces large structural shifts and is predictable.

DPM++ 2M performs well during detail enhancement stages. It maintains anatomy and fine structure better than many alternatives when working at higher resolutions.

DDIM is less aggressive and works well for final smoothing when you want stability rather than reinterpretation.

Regarding schedulers, Normal provides balanced behavior during initial generation. Linear_Quadratic smooths the refinement curve and helps avoid sudden tonal shifts. The Simple scheduler is consistent and stable during micro refinement.

In general, denoise values around 0.2 appear to be a sweet spot for iterative upscaling. Higher values tend to break anatomy at large resolutions.


Video Pipeline

For video generation, the final generated image is first downscaled to the megapixel value defined in the Video settings. It is then aligned to 32 pixels to prevent border artifacts.
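The downscale-to-megapixels step follows the same arithmetic as the latent sizing, just in reverse. A sketch, using the recommended 0.66 MP default from the settings above:

```python
import math

def video_size(img_w: int, img_h: int, megapixels: float = 0.66, align: int = 32):
    """Scale a generated image down to roughly `megapixels`, keeping its
    aspect ratio, then snap both sides to a multiple of `align` to
    prevent border artifacts in the video model."""
    scale = math.sqrt(megapixels * 1_000_000 / (img_w * img_h))
    snap = lambda v: max(align, round(v * scale / align) * align)
    return snap(img_w), snap(img_h)

print(video_size(1344, 736))  # hypothetical final-image resolution
```

Raising the megapixel value grows both dimensions and, with them, VRAM usage.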

The WAN 2.2 workflow is then executed. Tiled VAE decode is used because the standard decode can run out of memory at higher resolutions.

After decoding, the video is upscaled and RIFE frame interpolation is applied to improve motion smoothness. This produces more fluid animation and reduces visible stepping between frames.



Favorite Base Models

Just go to https://civitai.com/user/reijlita/models and download any model; they are all awesome!

Favorite LoRAs

Just try whatever you want.


Final Thoughts

This workflow focuses on clarity of control flow, structured prompt expansion, and conservative high resolution refinement.

Minimal input is enough to produce rich output. Detailed input is preserved and respected. The LLM does not replace your prompt; it enhances it.

If you prefer direct manual prompting, you can still use the workflow without relying heavily on the LLM. But when used as intended, it significantly reduces prompt micromanagement while improving scene coherence and anatomical stability.
