vLLM Prompt Node for ComfyUI

https://www.theoath.studio/projects/comfy-vllm-node?utm_source=civitai

A ComfyUI custom node that generates Stable Diffusion prompts using a locally running vLLM server. Supports wildcard expansion and a fixed prefix for quality tags or style anchors.

Installation

  1. Clone or copy this folder into your ComfyUI/custom_nodes/ directory:

cd ComfyUI/custom_nodes
git clone https://github.com/OATH-Studio/comfy-vLLM
  2. Restart ComfyUI.


Requirements

  • A running vLLM server (see vLLM docs)

  • Python package: requests (pip install requests)

  • ComfyUI


Setup

Start your local vLLM server. The node will automatically detect whichever model is currently loaded. No need to specify it in the node.

Example launch:

vllm serve ./models/Qwen2.5-3B \
--host 0.0.0.0 \
--port 8765 \
--served-model-name Qwen2.5-3B

Note: The node queries /v1/models on each generation and uses the first model returned. To switch models, restart your vLLM server with the new model; the node picks it up automatically.
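
For reference, a minimal sketch of that detection step using requests, assuming vLLM's standard OpenAI-compatible response shape for /v1/models (the node's actual code may differ):

import requests

def detect_model(host="localhost", port=8765):
    # Ask the server which models it is serving and use the first entry.
    resp = requests.get(f"http://{host}:{port}/v1/models", timeout=5)
    resp.raise_for_status()
    return resp.json()["data"][0]["id"]

print(detect_model())  # e.g. "Qwen2.5-3B"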


Node Inputs

Input          Type      Default                               Description
prompt         STRING                                          Generation instruction. Supports {wild|card} syntax.
prefix         STRING    masterpiece, best quality, highres    Fixed tags prepended to the output. Not sent to the model.
host           STRING    localhost                             vLLM server host.
port           INT       8765                                  vLLM server port.
max_tokens     INT       128                                   Maximum tokens to generate.
temperature    FLOAT     0.7                                   Sampling temperature. Higher = more creative.
retries        INT       3                                     How many times to retry on empty or failed responses.
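
The retries input is a simple retry loop around the request. A hedged sketch of that behaviour (the helper name and exact retry conditions here are illustrative, not the node's actual code):

import requests

def generate_with_retries(call, retries=3):
    # `call` is any zero-argument function that performs one generation request.
    for _ in range(retries):
        try:
            text = call()
            if text.strip():
                return text
        except requests.RequestException:
            pass  # connection or HTTP error: try again
    return ""  # every attempt failed or came back empty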


Node Output

Output             Type      Description
combined_prompt    STRING    prefix + generated text, ready to wire into CLIPTextEncode
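
As a sketch, the combination amounts to prepending the prefix to the cleaned-up generation (the exact separator is an assumption here):

def combine(prefix: str, generated: str) -> str:
    # Prepend the fixed prefix; fall back to the raw text when the prefix is empty.
    generated = generated.strip()
    return f"{prefix}, {generated}" if prefix.strip() else generated

print(combine("masterpiece, best quality, highres", "a red dragon, volcanic sky"))
# masterpiece, best quality, highres, a red dragon, volcanic sky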

The node displays a live preview after each generation showing:

  • Prefix

  • Raw generated text

  • Final combined string


Wildcard Syntax

Use {option1|option2|option3} anywhere in your prompt. One option is chosen at random each run. Multiple wildcards are resolved independently.

A {red|blue|green} dragon, {breathing fire into the sky|coiled around a mountain peak in a storm|diving into a glowing ocean abyss|rearing up against a blood moon}

Wildcards are expanded before the prompt is sent to the model, so the model always receives a fully resolved string.
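
A minimal sketch of that expansion step, assuming simple non-nested {a|b|c} groups resolved with a regular expression (illustrative, not necessarily the node's exact implementation):

import random
import re

def expand_wildcards(prompt: str) -> str:
    # Replace each {opt1|opt2|...} group with one randomly chosen option.
    return re.sub(r"\{([^{}]+)\}",
                  lambda m: random.choice(m.group(1).split("|")),
                  prompt)

print(expand_wildcards("A {red|blue|green} dragon"))  # e.g. "A blue dragon"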


Example Workflow

VLLMPromptNode ──→ CLIPTextEncode (positive) ──→ KSampler
                                                     ↑
                   CLIPTextEncode (negative) ────────┘

Prompt Format

The node uses the completions endpoint with a structured format that forces the model to return comma-separated tags only:

### Stable Diffusion prompt tags (comma separated, no sentences):

Input: <your expanded prompt>

Output:

Generation stops at the first newline, preventing extra text.
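
Put together, the request looks roughly like this sketch, assuming vLLM's OpenAI-compatible /v1/completions endpoint and a newline stop sequence (the exact prompt whitespace and parameter plumbing are illustrative):

import requests

def generate_tags(expanded_prompt, model, host="localhost", port=8765,
                  max_tokens=128, temperature=0.7):
    # Build the structured completion prompt described above.
    prompt = (
        "### Stable Diffusion prompt tags (comma separated, no sentences):\n"
        f"Input: {expanded_prompt}\n"
        "Output:"
    )
    resp = requests.post(
        f"http://{host}:{port}/v1/completions",
        json={"model": model, "prompt": prompt, "max_tokens": max_tokens,
              "temperature": temperature, "stop": ["\n"]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"].strip()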

If conversational output appears:

  • Lower temperature to 0.3–0.5

  • Use a larger model (≥ 1.5B recommended)

  • Reduce max_tokens


Model Recommendations

Model           Quality            Notes
Qwen2.5-0.5B    ⚠️ Unreliable      Too small for consistent instruction following
Qwen2.5-1.5B    ✓ Usable           Occasional filler, mostly clean
Qwen2.5-3B      ✓✓ Recommended     Clean output, follows format reliably
Qwen2.5-32B     ✓✓✓ Best           Overkill but flawless


Tested With

  • vLLM 0.4+

  • Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-3B

  • ComfyUI (latest)