
LTX 2.3 Lip-Sync Workflow – 3 min for a 10 s video, walk-and-talk supported

Updated: Apr 7, 2026

Type: Workflow

Base model: LTXV 2.3

Published: Apr 7, 2026

Hash (AutoV2): 1FE252F9BC

 

Try these workflows online first:

 

Workflow: Lip-Sync Speaking/Singing – LTX2.3 Image-to-Digital Human – Auto Expansion – Module Optimization – No Subtitles  

Experience link: https://www.runninghub.ai/post/2038618856104665090/?inviteCode=rh-v1401

 

Workflow: Text-to-Lip-Sync Video – Speaking/Singing – LTX2.3 Text-to-Digital Human – No Subtitles – Module Optimization  

Experience link: https://www.runninghub.ai/post/2038618886479814658/?inviteCode=rh-v1401

 

Workflow: LTX2.3 – Fully Automated Prompt – Text-to-Video  

Experience link: https://www.runninghub.ai/post/2031218445026594817/?inviteCode=rh-v1401

 

Workflow: LTX2.3 – Fully Automated Prompt – Image-to-Video – Modular Tuned Edition  

Experience link: https://www.runninghub.ai/post/2031218459471777794/?inviteCode=rh-v1401

 

Workflow: LTX2.3 – Fully Automated Prompt – First/Middle/Last Frame Three-Image-to-Video  

Experience link: https://www.runninghub.ai/post/2035325465820405761/?inviteCode=rh-v1401

 

Name: LTX 2.3 Image-to-Lip-Sync Meme Workflow (Modular / Ultra-Fast / Action-Supported)


 

Introduction:

Built on the open-source LTX 2.3 model, optimized for image-to-lip-sync videos. It allows any image (people/animals/medium-close-up) to accurately sing or speak along with the uploaded audio, while controlling actions (walking, waving, jumping, etc.) via prompts.
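For illustration, an action-controlling prompt in this setup might read something like the following (hypothetical wording, not taken from the workflow itself):

```
A young woman walks slowly toward the camera while speaking,
waving one hand casually; medium shot, steady camera, natural lighting.
```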

 


 

Core Advantages:

- Extremely fast: a 10-second video at 1280 resolution takes only 3–6 minutes, and the workflow runs even faster the second time

- 5-way batching: tested running five workflows simultaneously, producing a dozen or more finished videos per day

- Modular grouping: upload → dimension setting → audio → latent creation → upscale; a clear layout that is easy to modify

- With a fixed shot, generated clips are almost indistinguishable from the original footage; well suited to memes, entertainment, and VTubers

- Supports MP3 audio (if an error occurs, re-export the file once from CapCut)

- Avoid prompts such as "look down" or "turn around", as they break character consistency
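The five-way batching described above can be sketched with a thread pool. `run_workflow` here is a hypothetical placeholder standing in for submitting one lip-sync job (image + audio) to the actual pipeline:

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(job_id: int) -> str:
    # Placeholder: in practice this would submit one lip-sync job
    # to the LTX 2.3 workflow and wait for the rendered video.
    return f"video_{job_id}.mp4"


# Run five workflow instances side by side, as tested above.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(run_workflow, range(5)))

print(results)  # five finished video filenames, in submission order
```

`pool.map` preserves submission order, so results line up with the jobs even though they finish asynchronously.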

 


 

Workflow Structure:

1. Upload image (medium/close-up, clear lip movements)

2. Set dimensions (longest side 1280)

3. Upload audio (10-15 seconds recommended)

4. The Latent module references both the image and the audio, scaling them together

5. Final upscale and output
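Step 2 above (longest side 1280) can be sketched as a small helper. The multiple-of-32 rounding is an assumption on my part (it is common for video diffusion models) and should be checked against the workflow's actual resize node:

```python
def fit_longest_side(width: int, height: int,
                     target: int = 1280, multiple: int = 32):
    """Scale (width, height) so the longest side becomes `target`,
    rounding each side to the nearest `multiple` (assumed here)."""
    scale = target / max(width, height)

    def snap(v: int) -> int:
        # Round to the nearest multiple, never dropping below one multiple.
        return max(multiple, round(v * scale / multiple) * multiple)

    return snap(width), snap(height)


# A 4:3 source scales cleanly to the 1280 longest side:
print(fit_longest_side(1600, 1200))  # -> (1280, 960)
```

Feeding the snapped dimensions into the dimension-setting module keeps the latent creation step from rejecting odd sizes.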

 


 

Results Showcase:

This workflow has been used to create the "round-headed elderly meme singing" video (see example). Speaking lip-sync is equally strong; paired with Qwen (Tongyi Qianwen) voice design, it can drive digital humans.

 


 

Note:

Among open-source models, LTX 2.3 comes closest to cinema-grade texture and color control.

 


 
