TTS-RVC-API
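Yes, we can use Coqui together with RVC!
Why combine the two frameworks?
Coqui is a text-to-speech framework (vocoder and encoder), but cloning your own voice with it takes a very long time and does not guarantee better results. That is why we use RVC (Retrieval-based Voice Conversion), which works only for speech-to-speech. RVC uses Hubert, a pre-trained model that fine-tunes quickly and gives better results, so a dataset of just 2-3 minutes is enough to train a model.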
Installation
How to use the Coqui + RVC API?
https://github.com/skshadan/TTS-RVC-API.git

    python -m venv .venv
    . .venv/bin/activate
    pip install -r requirements.txt
    pip install TTS
    python -m uvicorn app.main:app
Now update the model_dir path in config.toml with a relative path, or set speaker_name in the request body.
Where the RVC V2 model is mounted on the container:
    /
    └── models
        └── speaker1
            ├── speaker1.pth
            └── speaker1.index
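Before starting the server, it can help to confirm that a speaker folder matches the layout above. The following is a minimal sketch, not part of this project: the helper name `find_speaker_files` is hypothetical, and it assumes the `.pth` and `.index` files are named after the speaker folder, as in the tree.

```python
from pathlib import Path


def find_speaker_files(model_dir, speaker_name):
    """Return (pth_path, index_path) for a speaker, or raise FileNotFoundError.

    Assumption: files are named <speaker_name>.pth and <speaker_name>.index
    inside <model_dir>/<speaker_name>/, mirroring the tree shown above.
    """
    speaker_dir = Path(model_dir) / speaker_name
    pth_path = speaker_dir / f"{speaker_name}.pth"
    index_path = speaker_dir / f"{speaker_name}.index"

    # Collect every missing file so the error message lists all of them.
    missing = [str(p) for p in (pth_path, index_path) if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing model files: " + ", ".join(missing))
    return pth_path, index_path
```

For example, `find_speaker_files("models", "speaker1")` returns both paths when the files exist and raises `FileNotFoundError` otherwise.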
Now run this
    python -m uvicorn app.main:app
POST request
http://localhost:8000/generate
emotions: happy, sad, angry, dull
speed = 1.0 - 2.0
    {
        "speaker_name": "speaker3",
        "input_text": "Hey there! Welcome to the world",
        "emotion": "Surprise",
        "speed": 1.0
    }
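The request body can also be built in Python before it is sent. This is a rough sketch under stated assumptions: `build_payload` is a hypothetical helper, not something the API provides, and the speed check only enforces the 1.0 - 2.0 range listed above on the client side. The emotion value is passed through unchecked, since it is not clear from this README which values the server accepts.

```python
import json

# Speed range as listed above; enforcing it client-side is an assumption.
MIN_SPEED, MAX_SPEED = 1.0, 2.0


def build_payload(speaker_name, input_text, emotion, speed=1.0):
    """Return the JSON request body as a string; reject out-of-range speed."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(
            f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}"
        )
    if not input_text.strip():
        raise ValueError("input_text must not be empty")
    return json.dumps({
        "speaker_name": speaker_name,
        "input_text": input_text,
        "emotion": emotion,
        "speed": speed,
    })
```

The returned string can be sent as the `data` argument of a POST to the `/generate` endpoint with a `Content-Type: application/json` header.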
Code snippet
    import requests
    import json
    import time

    url = "http://127.0.0.1:8000/generate"

    payload = json.dumps({
        "speaker_name": "speaker3",
        "input_text": "Are you mad? The way you've betrayed me is beyond comprehension, a slap in the face that's left me boiling with an anger so intense it's as if you've thrown gasoline on a fire, utterly destroying any trust that was left.",
        "emotion": "Dull",
        "speed": 1.0
    })
    headers = {
        'Content-Type': 'application/json'
    }

    start_time = time.time()  # Start the timer

    response = requests.request("POST", url, headers=headers, data=payload)

    end_time = time.time()  # Stop the timer

    if response.status_code == 200:
        audio_content = response.content

        # Save the audio to a file
        with open("generated_audio.wav", "wb") as audio_file:
            audio_file.write(audio_content)

        print("Audio saved successfully.")
        print("Time taken:", end_time - start_time, "seconds")
    else:
        print("Error:", response.text)
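Once the audio is saved, a quick sanity check is to read its length back. The sketch below uses only the standard-library `wave` module and assumes the server returns an uncompressed PCM WAV file, which this README does not confirm; `wav_duration_seconds` is a hypothetical helper name.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds.

    Assumption: the file is uncompressed PCM; the wave module raises
    wave.Error for formats it cannot read.
    """
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    return frames / float(rate)
```

Calling `wav_duration_seconds("generated_audio.wav")` after the snippet above gives a rough check that the response contained real audio rather than an empty or truncated file.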
Feedback
If you have any feedback, please contact shadankhantech@gmail.com
