
IndexTTS2: Vocal and Emotional Transfer, Two-Person Dialogue + Single-Person Speaking Workflow


Type: Workflows


Published: Oct 16, 2025

Base Model: Other

Hash: AutoV2 B4BC82E479

You can try the workflow online at the link below. If you like the results, you can then deploy it locally.

https://www.runninghub.ai/post/1968294270253838337/?inviteCode=sdhs0trb

Fan benefits: register to get 1000 points, plus 100 points for each daily login, and run on a 4090! Experience the power of 48G.

https://buymeacoffee.com/a592991299o

This workflow replicates a speaker's voice and emotion. It can generate emotional audio for single-person speech or a two-person conversation. It works better than previous models, which produced stiff vocals, and is strongly recommended. Deploying it in ComfyUI is relatively difficult. First, the transformers version must be 4.51.0. Second, make sure the json5 module is installed.
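The two requirements above can be checked before loading the workflow. The following is a minimal sketch, assuming the description refers to the Python packages `transformers` and `json5` and that the script is run with the same Python interpreter that ComfyUI uses. The function name `check_requirements` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Requirements as stated in the description. None means "any version".
# Assumption: the package names are "transformers" and "json5".
REQUIRED = {"transformers": "4.51.0", "json5": None}


def check_requirements(required=REQUIRED, get_version=version):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    for package, wanted in required.items():
        try:
            found = get_version(package)
        except PackageNotFoundError:
            problems.append(f"{package} is not installed")
            continue
        if wanted is not None and found != wanted:
            problems.append(f"{package} is {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_requirements() or ["environment looks OK"]:
        print(line)
```

If the check reports a problem, `pip install transformers==4.51.0 json5` in the ComfyUI Python environment is the usual fix.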
Project page: https://github.com/billwuhao/ComfyUI_IndexTTS
Model download links:
https://hf-mirror.com/nvidia/bigvgan_v2_22khz_80band_256x/tree/main
https://hf-mirror.com/funasr/campplus/tree/main
https://hf-mirror.com/IndexTeam/IndexTTS-2/tree/main
https://hf-mirror.com/amphion/MaskGCT/tree/main/semantic_codec
https://hf-mirror.com/facebook/w2v-bert-2.0/tree/main
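Each link above follows the pattern host/owner/repo/tree/revision, optionally followed by a subfolder, and the repo name is also the name of the local folder the files go into. The sketch below only splits the links into those parts so they can be handed to whatever download tool you prefer. It does not download anything, and the helper name `parse_model_page` is hypothetical.

```python
from urllib.parse import urlparse

MODEL_PAGES = [
    "https://hf-mirror.com/nvidia/bigvgan_v2_22khz_80band_256x/tree/main",
    "https://hf-mirror.com/funasr/campplus/tree/main",
    "https://hf-mirror.com/IndexTeam/IndexTTS-2/tree/main",
    "https://hf-mirror.com/amphion/MaskGCT/tree/main/semantic_codec",
    "https://hf-mirror.com/facebook/w2v-bert-2.0/tree/main",
]


def parse_model_page(url):
    """Split '<host>/<owner>/<repo>/tree/<revision>[/<subfolder>]' into parts."""
    parts = urlparse(url).path.strip("/").split("/")
    owner, repo = parts[0], parts[1]
    # parts[2] is the literal "tree"; parts[3] is the revision.
    revision = parts[3] if len(parts) > 3 else "main"
    return {
        "repo_id": f"{owner}/{repo}",
        "revision": revision,
        "subfolder": "/".join(parts[4:]),  # empty when the whole repo is needed
        "local_dir": repo,  # matches the folder names in the placement structure
    }


if __name__ == "__main__":
    for page in MODEL_PAGES:
        print(parse_model_page(page))
```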
Model placement structure:

- bigvgan_v2_22khz_80band_256x
    bigvgan_generator.pt
    config.json
- campplus
    campplus_cn_common.bin
- IndexTTS-2
    │  .gitattributes
    │  bpe.model
    │  config.yaml
    │  feat1.pt
    │  feat2.pt
    │  gpt.pth
    │  README.md
    │  s2mel.pth
    │  wav2vec2bert_stats.pt
    │
    └─ qwen0.6bemo4-merge
            added_tokens.json
            chat_template.jinja
            config.json
            generation_config.json
            merges.txt
            model.safetensors
            Modelfile
            special_tokens_map.json
            tokenizer.json
            tokenizer_config.json
            vocab.json
- MaskGCT
    └─ semantic_codec
            model.safetensors
- w2v-bert-2.0
    .gitattributes
    config.json
    conformer_shaw.pt
    model.safetensors
    preprocessor_config.json
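
A misplaced file is an easy mistake with this many folders, so a quick check of the layout can save a failed run. The sketch below is one way to do it: it lists the main model and config files from the structure above and reports any that are missing. The root directory is not stated in this description, so `root` is an argument you must supply yourself, and the names `EXPECTED` and `missing_files` are made up for illustration.

```python
from pathlib import Path

# Main files from the placement structure above, relative to each model folder.
# README.md, .gitattributes and the smaller tokenizer files are left out for brevity.
EXPECTED = {
    "bigvgan_v2_22khz_80band_256x": ["bigvgan_generator.pt", "config.json"],
    "campplus": ["campplus_cn_common.bin"],
    "IndexTTS-2": [
        "bpe.model",
        "config.yaml",
        "feat1.pt",
        "feat2.pt",
        "gpt.pth",
        "s2mel.pth",
        "wav2vec2bert_stats.pt",
        "qwen0.6bemo4-merge/config.json",
        "qwen0.6bemo4-merge/model.safetensors",
        "qwen0.6bemo4-merge/tokenizer.json",
    ],
    "MaskGCT": ["semantic_codec/model.safetensors"],
    "w2v-bert-2.0": [
        "config.json",
        "conformer_shaw.pt",
        "model.safetensors",
        "preprocessor_config.json",
    ],
}


def missing_files(root):
    """Return 'folder/file' entries that do not exist under root."""
    root = Path(root)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED.items()
        for name in names
        if not (root / folder / name).is_file()
    ]
```

Calling `missing_files("path/to/your/model/folder")` returns an empty list when every listed file is in place.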