
LTX-2 ControlNet in ComfyUI | Depth-Controlled Video Workflow

Updated: Apr 1, 2026


Type: Workflows
Stats: 53
Reviews: 0
Published: Apr 1, 2026
Base Model: LTXV2
Hash (AutoV2): FDF7AF6FD3

RunComfy

Sharp control, perfect sync, super clear AI video creation.

Who it's for: creators who want this pipeline in ComfyUI without assembling nodes from scratch. Not for: one-click results with zero tuning — you still choose inputs, prompts, and settings.

Open preloaded workflow on RunComfy


Why RunComfy first
- Fewer missing-node surprises — run the graph in a managed environment before you mirror it locally.
- Quick GPU tryout — useful if your local VRAM or install time is the bottleneck.
- Matches the published JSON — the zip follows the same runnable workflow you can open on RunComfy.

When downloading for local ComfyUI makes sense — you want full control over models on disk, batch scripting, or offline runs.

How to use (local ComfyUI)
1. Load inputs (images/video/audio) in the marked loader nodes.
2. Set prompts, resolution, and seeds; start with a short test run.
3. Export from the Save / Write nodes shown in the graph.
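For the batch-scripting use case mentioned above, the steps can also be driven from a script. The sketch below queues a workflow on a locally running ComfyUI server via its HTTP API; the host address and the workflow filename are assumptions for your setup, and the graph must be exported with ComfyUI's "Save (API Format)" option rather than the regular save.

```python
import json
import urllib.request

# Assumed default ComfyUI server address — adjust if you changed the port.
COMFYUI_HOST = "http://127.0.0.1:8188"


def build_payload(prompt_graph: dict, client_id: str = "batch-script") -> bytes:
    """Wrap an API-format workflow graph into the body for the /prompt endpoint."""
    return json.dumps({"prompt": prompt_graph, "client_id": client_id}).encode("utf-8")


def queue_workflow(workflow_path: str) -> dict:
    """POST an exported workflow JSON to a running ComfyUI instance.

    Returns the server's response, which includes a prompt_id you can
    use to track the job in the queue.
    """
    with open(workflow_path, "r", encoding="utf-8") as f:
        graph = json.load(f)
    req = urllib.request.Request(
        f"{COMFYUI_HOST}/prompt",
        data=build_payload(graph),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (with ComfyUI running locally and the workflow exported in API format):
#   result = queue_workflow("ltx2_controlnet_api.json")
#   print(result["prompt_id"])
```

Looping `queue_workflow` over a list of input files or seeds gives you the offline batch runs that a managed cloud session does not.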

Expectations — First run may pull large weights; cloud runs may require a free RunComfy account.


Overview

This ControlNet-powered LTX-2 workflow enables highly accurate video generation guided by explicit structural conditions such as depth maps, canny edges, and human poses. By using ControlNet-style IC LoRA conditioning, it enforces strong spatial and motion constraints across all frames while generating synchronized audio and visuals in a unified latent space. The workflow supports text-to-video, image-to-video, and video-to-video pipelines, allowing creators to precisely control scene structure, movement, and continuity. Its two-stage architecture provides efficient upscaling and optimized memory usage, making it ideal for refined, controllable, and production-ready video synthesis.

Important nodes:

Key nodes in the ComfyUI LTX-2 ControlNet workflow

  • LTXVAddGuide (#132) — Merges text conditioning and IC LoRA controls into the AV latent, acting as the heart of LTX-2 ControlNet guidance. Adjust only the few controls that matter: choose the control LoRA that matches your path (depth, canny, or pose) and, when available, the image_strength that tunes how tightly the model follows guides. Reference implementation and node behavior are provided by the LTXVideo extension. Docs/Code

  • LTXVImgToVideoInplace (#149, #155) — Injects a first-frame image into the AV latent for consistent scene initialization. Use strength to balance faithfulness to the first frame versus freedom to evolve; keep it lower for more motion and higher for tighter anchors. Bypass it when you want purely text- or control-driven openings. Docs/Code

  • LTXVScheduler (#95) — Drives the denoising trajectory for the unified latent so both audio and video converge together. Increase steps for complex scenes and fine detail; shorten for drafts and quick iteration. Schedule settings interact with guidance strength, so avoid extreme values when guidance is strong. Docs/Code

  • LTXVLatentUpsampler (#112) — Performs the second-stage latent upscaling with the LTX-2 x2 spatial upscaler, improving sharpness with minimal VRAM growth. Use it after the first pass rather than increasing base resolution to keep iterations responsive. Upscaler model

  • DWPreprocessor (#158) — Generates clean human pose keypoints for the pose-control path. Verify detections with the preview; if hands or small limbs are noisy, scale inputs to a moderate max dimension before preprocessing. Provided by the ControlNet auxiliary suite. Repo
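When tuning the node parameters above (guide strength, first-frame strength, scheduler steps), it can be faster to override them programmatically in the exported API-format JSON than to re-save the graph for each variant. This is a minimal sketch; the node ID `132` and the input name `image_strength` mirror the examples above but are assumptions — open your exported JSON and check the actual keys.

```python
import copy


def set_node_input(graph: dict, node_id: str, name: str, value):
    """Return a copy of an API-format workflow with one node input overridden.

    API-format exports map node IDs to {"class_type": ..., "inputs": {...}},
    so overriding a parameter is a plain dict write on a deep copy.
    """
    updated = copy.deepcopy(graph)
    updated[node_id]["inputs"][name] = value
    return updated


# Sweep guide strength on the LTXVAddGuide node for quick A/B comparisons
# (toy graph fragment — real exports contain the full node set):
base = {"132": {"class_type": "LTXVAddGuide", "inputs": {"image_strength": 1.0}}}
variants = [set_node_input(base, "132", "image_strength", s) for s in (0.5, 0.75, 1.0)]
```

Queuing each variant in turn (for example with a helper that POSTs to ComfyUI's /prompt endpoint) makes short parameter sweeps repeatable instead of manual.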

Notes

See the RunComfy page for the latest node requirements for this workflow.