IndexTTS2: The Most Realistic Open-Source Voice Cloning/TTS Tool with Emotion Control

Official documentation

Features

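The core capability of IndexTTS2 is TTS (text-to-speech). You upload a reference audio clip (which supplies the voice timbre), provide the text to be read, and choose an emotion control mode; the tool then reads the text in the reference clip's timbre with the emotion you specified. Compared with IndexTTS1.5, the most notable addition is four emotion control modes, and in my testing the emotion control works very well.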

Same as the timbre reference audio

The emotion is taken from the timbre reference audio itself.

Use an emotion reference audio

The emotion is taken from a separately uploaded audio clip, and you can also set an emotion weight (0-1.6).

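Use emotion vector control

The emotion is controlled by setting a weight for each of eight emotions (happy, angry, sad, afraid, disgusted, melancholic, surprised, calm).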

Use emotion description text control

The emotion is controlled with a natural-language description.
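
To make the emotion-vector and emotion-weight modes more concrete, here is a minimal Python sketch of how such inputs could be prepared. It does not call IndexTTS2 itself. The helper names `build_emotion_vector` and `clamp_emotion_weight` are hypothetical, and the slot order is only assumed to follow the order in which the eight emotions are listed above; only the emotion names and the 0-1.6 range come from this article.

```python
# Hypothetical helpers for illustration; they are not part of IndexTTS2.
# Assumption: the vector slots follow the order the eight emotions are listed in.
EMOTIONS = ["happy", "angry", "sad", "afraid",
            "disgusted", "melancholic", "surprised", "calm"]
MAX_EMOTION_WEIGHT = 1.6  # upper bound of the emotion weight stated in the article


def clamp_emotion_weight(weight: float) -> float:
    """Clamp an emotion-reference weight into the 0-1.6 range."""
    return max(0.0, min(MAX_EMOTION_WEIGHT, weight))


def build_emotion_vector(weights: dict[str, float]) -> list[float]:
    """Return one weight per emotion in EMOTIONS order; unspecified emotions get 0.0."""
    unknown = set(weights) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotions: {sorted(unknown)}")
    return [float(weights.get(name, 0.0)) for name in EMOTIONS]
```

For example, `build_emotion_vector({"surprised": 0.45})` yields a list whose seventh slot is 0.45 and whose other slots are 0.0, which mirrors moving a single slider in the emotion vector mode.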

Installation

VRAM requirement: in my testing, it needed about 12.7 GB of VRAM.
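
Before installing, it may be worth checking whether the GPU has that much memory at all. The sketch below is one possible way to do so, assuming an NVIDIA GPU with `nvidia-smi` on the PATH. The function names are hypothetical, and treating 1 GB as 1024 MiB is an assumption, since the figure above does not say which unit was meant.

```python
import subprocess

REQUIRED_GB = 12.7  # figure reported in this article


def parse_total_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi's headerless CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def has_enough_vram(total_mib: int, required_gb: float = REQUIRED_GB) -> bool:
    """Compare total VRAM with the requirement, assuming 1 GB = 1024 MiB."""
    return total_mib >= required_gb * 1024


def query_total_vram_mib() -> list[int]:
    """Ask nvidia-smi for each GPU's total memory; needs an NVIDIA driver installed."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_total_vram_mib(result.stdout)
```

Calling `[has_enough_vram(mib) for mib in query_total_vram_mib()]` gives one result per detected GPU. Total memory is only an upper bound: other programs already using the GPU reduce what is actually available.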

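The following walks through installation using Windows 11 as an example. Enter the commands below in cmd, one at a time. Lines beginning with # are notes rather than commands; cmd does not treat # as a comment marker, so skip them when typing.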

shell

git lfs install
git clone https://github.com/index-tts/index-tts.git
cd index-tts
# download large repository files
git lfs pull

conda create -n index-tts2_env python=3.10 -y
conda activate index-tts2_env

# uv is installed under index-tts2_env\Scripts\
pip install -U uv
# install the correct versions of all dependencies into your .venv directory
uv sync
# also install the optional webui extra
uv sync --extra webui

# Download the model. First create a token at https://huggingface.co/settings/tokens/new?tokenType=write
uv tool install "huggingface_hub[cli]"
# replace YOUR_HF_TOKEN with the token you just created
hf download IndexTeam/IndexTTS-2 --local-dir=checkpoints --token YOUR_HF_TOKEN

# if you need to diagnose your environment to see which GPUs are detected, you can use our included utility to check your system:
uv run tools/gpu_check.py

# Start the application
uv run webui.py

If the following log line appears, the application started successfully.

text

* Running on local URL:  http://0.0.0.0:7860

Open http://127.0.0.1:7860 in a browser to use it.
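
If the page does not load, a quick way to tell whether anything is listening on that port is a plain TCP connection test. The following is a minimal sketch using only the Python standard library. The function name `is_port_open` is hypothetical, and the check only shows that some process accepts connections on the port, not that it is the IndexTTS2 web UI.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 7860, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

With the web UI running, `is_port_open()` should return True; if it returns False, check the cmd window for errors from `uv run webui.py`.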
