🎨 VIBE for ComfyUI
This is a custom node implementation of VIBE (Visual Instruction Based Editor) for ComfyUI. It allows you to edit images using natural language instructions (e.g., "make it winter", "change the dog to a cat", "remove the background").
The workflow leverages the efficient Sana1.5-1.6B diffusion model and Qwen3-VL-2B-Instruct for fast, high-quality image manipulation directly on your GPU.
📺 Video Demo
See VIBE in action! This video showcases various styles and the incredible speed of the model:
🖼️ Example Workflow
Drag and drop this image into ComfyUI to load the workflow:

💡 Best Practices & Pro-Tips
To get the best results from VIBE, keep these observations in mind:
1. Resolution Matters!
The VIBE model (based on Sana) is optimized for high resolutions.
Recommendation: For best detail preservation and anatomical correctness, use resolutions above 2.5 Megapixels (approx. 1600x1600 or higher).
At lower resolutions (e.g., 1024x1024), the model may struggle with fine details or produce "smooth/blurry" textures due to the VAE bottleneck.
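As a rough illustration of the recommendation above, the sketch below computes the megapixel count of an image and suggests dimensions that reach about 2.5 MP while roughly keeping the aspect ratio. The helper names and the rounding of each side up to a multiple of 32 are assumptions made for this example. They are not part of the VIBE node.

```python
import math

TARGET_MEGAPIXELS = 2.5  # threshold recommended in this section
MULTIPLE = 32            # assumption: round each side up to a multiple of 32


def megapixels(width: int, height: int) -> float:
    """Return the image size in megapixels (millions of pixels)."""
    return width * height / 1_000_000


def upscale_to_target(width: int, height: int,
                      target_mp: float = TARGET_MEGAPIXELS,
                      multiple: int = MULTIPLE) -> tuple:
    """Suggest (width, height) with at least target_mp megapixels.

    The aspect ratio is kept approximately; both sides are rounded up,
    so the result never falls below the target.
    """
    if megapixels(width, height) >= target_mp:
        scale = 1.0  # already large enough, only snap to the multiple
    else:
        scale = math.sqrt(target_mp * 1_000_000 / (width * height))
    new_w = math.ceil(width * scale / multiple) * multiple
    new_h = math.ceil(height * scale / multiple) * multiple
    return new_w, new_h


if __name__ == "__main__":
    print(upscale_to_target(1024, 1024))
    print(upscale_to_target(1920, 1080))
```

The suggested size can then be entered in whatever resize or upscale node sits in front of the editor in your workflow.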
2. Avoid "Burn-in" / Degradation
If you are performing iterative editing (passing the image through the node multiple times), you might notice increased saturation or contrast artifacts.
Solution: Set the seed control to randomize or increment so that every generation uses a different seed.
Using a fixed seed repeatedly on the same image tends to accumulate model bias and artifacts. A variable seed distributes the noise differently each time, keeping the image cleaner.
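If you drive iterative edits from a script rather than from the ComfyUI interface, the same idea can be sketched as follows. This is only an illustration: `edit_fn` is a hypothetical stand-in for whatever performs one editing pass, and the 32-bit seed range is an assumption, not something taken from the node.

```python
import random
from typing import Any, Callable, Iterator, Sequence

MAX_SEED = 2**32 - 1  # assumption: seeds are kept within 32 bits


def seed_sequence(start: int, mode: str = "increment") -> Iterator[int]:
    """Yield a new seed for every editing pass."""
    if mode not in ("increment", "randomize"):
        raise ValueError("mode must be 'increment' or 'randomize'")
    rng = random.Random(start)  # seeded so that runs are reproducible
    seed = start % (MAX_SEED + 1)
    while True:
        yield seed
        if mode == "increment":
            seed = (seed + 1) % (MAX_SEED + 1)
        else:
            seed = rng.randint(0, MAX_SEED)


def iterative_edit(image: Any,
                   instructions: Sequence[str],
                   edit_fn: Callable[[Any, str, int], Any],
                   start_seed: int = 0,
                   mode: str = "increment") -> Any:
    """Apply each instruction in turn, never reusing the previous seed."""
    seeds = seed_sequence(start_seed, mode)
    for instruction, seed in zip(instructions, seeds):
        image = edit_fn(image, instruction, seed)
    return image


if __name__ == "__main__":
    # Stand-in edit function that only records what it was asked to do.
    def fake_edit(image, instruction, seed):
        return image + [(instruction, seed)]

    print(iterative_edit([], ["make it winter", "add snow"], fake_edit, 41))
```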
3. Text-to-Image (Experimental)
While VIBE is an editor, it can technically generate images from scratch.
How to use: I have left an Empty Latent Image node in the workflow connected to the latent input. If you disconnect the input image, VIBE will generate from pure text.
Warning: The quality will be lower than dedicated T2I models (like FLUX or SDXL) because VIBE is not trained for generation from scratch. Treat this as an experimental feature for curious users.
⚡ Easy Installation (via ComfyUI Manager)
Load this workflow into ComfyUI.
Open ComfyUI Manager and click "Install Missing Custom Nodes".
Restart ComfyUI after the installation finishes.
Once reloaded, locate the VIBE Image Editor node (in Step 2) and click the "Check / Download Model" button. This will automatically download the necessary weights.
(Alternatively, you can follow the manual installation process below)
🛠️ Manual Installation
1. Install the Node
Navigate to your ComfyUI/custom_nodes folder and clone the repository:
cd ComfyUI/custom_nodes
git clone https://github.com/ato-zen/ComfyUI-VIBE
2. Install Dependencies
From the custom_nodes folder, enter the cloned ComfyUI-VIBE folder and install its requirements:
cd ComfyUI-VIBE
pip install -r requirements.txt
📂 Model Setup
The node automatically looks for weights in: ComfyUI/models/vibe/
You need to download the weights manually as they are large.
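Before or after downloading, it can be useful to check whether the weights are already where the node expects them. The sketch below looks for the entries listed in the folder structure shown in this section. The function name is made up for this example, and the check only tests that the entries exist. It does not verify that the files were fully downloaded.

```python
from pathlib import Path

# Entries taken from the folder structure shown in this section.
EXPECTED_DIRS = ("scheduler", "text_encoder", "tokenizer", "transformer", "vae")
EXPECTED_FILES = ("model_index.json",)


def missing_weight_entries(comfy_root: str) -> list:
    """Return the expected entries that are absent under models/vibe/VIBE-Image-Edit."""
    model_dir = Path(comfy_root) / "models" / "vibe" / "VIBE-Image-Edit"
    missing = [name + "/" for name in EXPECTED_DIRS
               if not (model_dir / name).is_dir()]
    missing += [name for name in EXPECTED_FILES
                if not (model_dir / name).is_file()]
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; pass the path to your own installation.
    absent = missing_weight_entries("ComfyUI")
    print("All expected entries found" if not absent else f"Missing: {absent}")
```

An empty result means every expected entry is present; otherwise continue with the steps below.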
Create the directory:
cd ComfyUI/models
mkdir vibe
cd vibe
Clone the weights:
(Ensure you have git-lfs installed)
git clone https://huggingface.co/iitolstykh/VIBE-Image-Edit
Your folder structure should look like this:
📂 ComfyUI/
└── 📂 models/
    └── 📂 vibe/
        └── 📂 VIBE-Image-Edit/
            ├── model_index.json
            ├── 📂 scheduler/
            ├── 📂 text_encoder/
            ├── 📂 tokenizer/
            ├── 📂 transformer/
            └── 📂 vae/
💻 Hardware & System Compatibility
VIBE (built on Sana 1.5) requires modern GPU hardware features (Flash Attention 2 and Triton kernels) to function.
✅ Full Support: NVIDIA RTX 30xx / 40xx / A-series (Ampere, Ada, Hopper). Best performance and native BF16 support.
⚠️ Partial Support: NVIDIA RTX 20xx. May work but might encounter speed issues or black image (NaN) errors.
❌ Unsupported: NVIDIA GTX 10xx (Pascal) & older. These cards lack hardware support for the required Triton kernels. If you see the error: "GET was unable to find an engine", your GPU is likely too old.
❌ Unsupported: AMD or Apple Silicon (M1/M2/M3). The model is strictly tied to NVIDIA CUDA/Triton.
🌐 OS Support: Linux is highly recommended. Windows users should use WSL2; native Windows support for Triton is currently unofficial and unstable.
VRAM Requirements:
Minimum: 12GB VRAM.
Recommended: 24GB VRAM (required for high-quality 2K / 1600px+ workflows).
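The compatibility notes above can be turned into a small self-check. This is a sketch under stated assumptions: it maps the listed GPU families to CUDA compute capability (8.0 and above for Ampere, Ada and Hopper, 7.5 for RTX 20xx, lower for Pascal and older), and it assumes a driver recent enough that `nvidia-smi` supports the `compute_cap` query field. The function names are illustrative and not part of the node.

```python
import subprocess
from typing import List, Tuple


def classify_gpu(compute_cap: float, vram_mib: int) -> Tuple[str, str]:
    """Map compute capability and VRAM to the support tiers listed above."""
    if compute_cap >= 8.0:
        # Ampere/Ada/Hopper; newer architectures are not covered by the list.
        support = "full"
    elif compute_cap >= 7.5:
        support = "partial"      # RTX 20xx (Turing)
    else:
        support = "unsupported"  # assumption: everything below 7.5

    vram_gb = round(vram_mib / 1024)  # a 24GB card reports slightly under 24576 MiB
    if vram_gb >= 24:
        vram = "recommended"
    elif vram_gb >= 12:
        vram = "minimum"
    else:
        vram = "below minimum"
    return support, vram


def parse_nvidia_smi(csv_text: str) -> List[Tuple[str, float, int]]:
    """Parse 'name, compute_cap, memory.total' rows (csv, no header, no units)."""
    rows = []
    for line in csv_text.strip().splitlines():
        name, cap, mem = [field.strip() for field in line.split(",")]
        rows.append((name, float(cap), int(mem)))
    return rows


def query_gpus() -> List[Tuple[str, float, int]]:
    """Ask nvidia-smi for every GPU's name, compute capability and VRAM in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_nvidia_smi(result.stdout)


if __name__ == "__main__":
    for name, cap, mem in query_gpus():
        print(name, classify_gpu(cap, mem))
```

A "full" result only reflects the hardware tier; Flash Attention 2 and Triton still have to be installed and working in your Python environment.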
🐞 Known Issues & Support
If you encounter the error "GET was unable to find an engine to execute this computation", your GPU is most likely an older NVIDIA card that lacks the required hardware support.
How to Report a Bug
If you are on a supported GPU and encounter issues, please check your terminal logs and open an issue on GitHub:
👉 Report Issue on GitHub
Please always include your GPU model, Operating System, and the full console error log.
🔗 Links & Credits
VIBE Custom Node: github.com/ato-zen/ComfyUI-VIBE
Model Weights (HF): iitolstykh/VIBE-Image-Edit
Original Research: ai-forever/VIBE


