Reference-driven video generation using Wan2.1 14B
**Who it's for:** creators who want this pipeline in ComfyUI without assembling nodes from scratch. **Not for:** one-click results with zero tuning — you still choose inputs, prompts, and settings.
### Open preloaded workflow on RunComfy
[Open preloaded workflow on RunComfy (browser)](https://www.runcomfy.com/comfyui-workflows/comfyui-phantom-subject-to-video?utm_source=civitai&utm_medium=referral)
**Why RunComfy first**
- **Fewer missing-node surprises** — run the graph in a managed environment before you mirror it locally.
- **Quick GPU tryout** — useful if your local VRAM or install time is the bottleneck.
- **Matches the published JSON** — the downloadable zip contains the same runnable workflow you can open on RunComfy.
**When downloading for local ComfyUI makes sense** — you want full control over models on disk, batch scripting, or offline runs.
**How to use (local ComfyUI)**
1. Load inputs (images/video/audio) in the marked loader nodes.
2. Set prompts, resolution, and seeds; start with a short test run (repeatable from a script — see the sketch after these steps).
3. Export from the Save / Write nodes shown in the graph.
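Once the graph works, step 2's test runs are easy to repeat from a script. A minimal sketch, assuming a local ComfyUI server on its default port 8188 and a workflow exported via ComfyUI's "Save (API Format)" option — the filename and node ID below are hypothetical placeholders, not part of this workflow:

```python
# Minimal sketch: queue an API-format workflow against a local ComfyUI server.
# Assumes ComfyUI is running at 127.0.0.1:8188 (the default) and that
# "phantom_workflow_api.json" was exported with "Save (API Format)".
import json
import urllib.request

with open("phantom_workflow_api.json", "r", encoding="utf-8") as f:
    workflow = json.load(f)

# Optionally tweak settings per run before queueing, e.g. a sampler seed.
# "3" is a hypothetical node ID -- check the IDs in your own export.
# workflow["3"]["inputs"]["seed"] = 42

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))  # server replies with a prompt_id
```

The same loop works for batch scripting: load the JSON once, vary seeds or prompts in a loop, and queue each variant.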
**Expectations** — the first run may pull **large weights**; cloud runs may require a **free RunComfy account**.
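Because those weights can run to tens of gigabytes, it can help to fetch them before opening the graph. A minimal sketch with `huggingface_hub` — the repo ID and filename are assumptions for illustration; use the exact models your copy of the workflow references:

```python
# Minimal sketch: pre-fetch a large model file so the first ComfyUI run
# doesn't stall mid-graph. Repo ID and filename below are ASSUMPTIONS --
# substitute the models actually listed in the workflow notes.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Comfy-Org/Wan_2.1_ComfyUI_repackaged",      # assumed repo
    filename="split_files/vae/wan_2.1_vae.safetensors",  # assumed file
)
print(path)  # then copy or symlink into ComfyUI/models/vae/
```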
---
### Overview
ComfyUI Phantom is a unified video generation framework for single- and multi-subject references, built on existing text-to-video and image-to-video architectures. It achieves cross-modal alignment by redesigning the joint text-image injection model and training on text-image-video triplet data, and it emphasizes subject consistency in human generation while enhancing ID-preserving video generation.
In simpler terms, ComfyUI Phantom generates videos from one or more reference images plus a text prompt, which makes it well suited to identity-consistent human video synthesis.
### Notes
**ComfyUI Phantom | Subject to Video** — see [RunComfy page](https://www.runcomfy.com/comfyui-workflows/comfyui-phantom-subject-to-video?utm_source=civitai&utm_medium=referral) for the latest node requirements.

