Real Time Voice Cloning

2025-12-11 0 587

Real-Time Voice Cloning

This repository is an implementation of Transfer Learning from Speaker Verification to
Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in realtime. This was my master\’s thesis.

SV2TTS is a deep learning framework in three stages. In the first stage, one creates a digital representation of a voice from a few seconds of audio. In the second and third stages, this representation is used as reference to generate speech given arbitrary text.

Video demonstration (click the picture):

Papers implemented

URL Designation Title Implementation source
1806.04558 SV2TTS Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis This repo
1802.08435 WaveRNN (vocoder) Efficient Neural Audio Synthesis fatchord/WaveRNN
1703.10135 Tacotron (synthesizer) Tacotron: Towards End-to-End Speech Synthesis fatchord/WaveRNN
1710.10467 GE2E (encoder) Generalized End-To-End Loss for Speaker Verification This repo

Heads up

Like everything else in Deep Learning, this repo has quickly gotten old. Many SaaS apps (often paying) will give you a better audio quality than this repository will. If you wish for an open-source solution with a high voice quality:

  • Check out paperswithcode for other repositories and recent research in the field of speech synthesis.
  • Check out Chatterbox for a similar project up to date with the 2025 SOTA in voice cloning

Setup

1. Install Requirements

  1. Both Windows and Linux are supported. A GPU is recommended for training and for inference speed, but is not mandatory.
  2. Python 3.7 is recommended. Python 3.5 or greater should work, but you\’ll probably have to tweak the dependencies\’ versions. I recommend setting up a virtual environment using venv, but this is optional.
  3. Install ffmpeg. This is necessary for reading audio files.
  4. Install PyTorch. Pick the latest stable version, your operating system, your package manager (pip by default) and finally pick any of the proposed CUDA versions if you have a GPU, otherwise pick CPU. Run the given command.
  5. Install the remaining requirements with pip install -r requirements.txt

2. (Optional) Download Pretrained Models

Pretrained models are now downloaded automatically. If this doesn\’t work for you, you can manually download them here.

3. (Optional) Test Configuration

Before you download any dataset, you can begin by testing your configuration with:

python demo_cli.py

If all tests pass, you\’re good to go.

4. (Optional) Download Datasets

For playing with the toolbox alone, I only recommend downloading LibriSpeech/train-clean-100. Extract the contents as <datasets_root>/LibriSpeech/train-clean-100 where <datasets_root> is a directory of your choosing. Other datasets are supported in the toolbox, see here. You\’re free not to download any dataset, but then you will need your own data as audio files or you will have to record it with the toolbox.

5. Launch the Toolbox

You can then try the toolbox:

python demo_toolbox.py -d <datasets_root>
or
python demo_toolbox.py

depending on whether you downloaded any datasets. If you are running an X-server or if you have the error Aborted (core dumped), see this issue.

下载源码

通过命令行克隆项目:

git clone https://github.com/CorentinJ/Real-Time-Voice-Cloning.git

收藏 (0) 打赏

感谢您的支持,我会继续努力的!

打开微信/支付宝扫一扫,即可进行扫码打赏哦,分享从这里开始,精彩与您同在
点赞 (0)

申明:本文由第三方发布,内容仅代表作者观点,与本网站无关。对本文以及其中全部或者部分内容的真实性、完整性、及时性本站不作任何保证或承诺,请读者仅作参考,并请自行核实相关内容。本网发布或转载文章出于传递更多信息之目的,并不意味着赞同其观点或证实其描述,也不代表本网对其真实性负责。

左子网 编程相关 Real Time Voice Cloning https://www.zuozi.net/34504.html

LLMs from scratch
上一篇: LLMs from scratch
yolov5
下一篇: yolov5
常见问题
  • 1、自动:拍下后,点击(下载)链接即可下载;2、手动:拍下后,联系卖家发放即可或者联系官方找开发者发货。
查看详情
  • 1、源码默认交易周期:手动发货商品为1-3天,并且用户付款金额将会进入平台担保直到交易完成或者3-7天即可发放,如遇纠纷无限期延长收款金额直至纠纷解决或者退款!;
查看详情
  • 1、描述:源码描述(含标题)与实际源码不一致的(例:货不对板); 2、演示:有演示站时,与实际源码小于95%一致的(但描述中有”不保证完全一样、有变化的可能性”类似显著声明的除外); 3、发货:不发货可无理由退款; 4、安装:免费提供安装服务的源码但卖家不履行的; 5、收费:价格虚标,额外收取其他费用的(但描述中有显著声明或双方交易前有商定的除外); 6、其他:如质量方面的硬性常规问题BUG等。 注:经核实符合上述任一,均支持退款,但卖家予以积极解决问题则除外。
查看详情
  • 1、左子会对双方交易的过程及交易商品的快照进行永久存档,以确保交易的真实、有效、安全! 2、左子无法对如“永久包更新”、“永久技术支持”等类似交易之后的商家承诺做担保,请买家自行鉴别; 3、在源码同时有网站演示与图片演示,且站演与图演不一致时,默认按图演作为纠纷评判依据(特别声明或有商定除外); 4、在没有”无任何正当退款依据”的前提下,商品写有”一旦售出,概不支持退款”等类似的声明,视为无效声明; 5、在未拍下前,双方在QQ上所商定的交易内容,亦可成为纠纷评判依据(商定与描述冲突时,商定为准); 6、因聊天记录可作为纠纷评判依据,故双方联系时,只与对方在左子上所留的QQ、手机号沟通,以防对方不承认自我承诺。 7、虽然交易产生纠纷的几率很小,但一定要保留如聊天记录、手机短信等这样的重要信息,以防产生纠纷时便于左子介入快速处理。
查看详情

相关文章

猜你喜欢
发表评论
暂无评论
官方客服团队

为您解决烦忧 - 24小时在线 专业服务