The ultimate VITS2

The goal of this repo is to provide the most comprehensive VITS2 implementation out there.

Changelist

pre-requisites

Python >= 3.8
CUDA
PyTorch version 1.13.1 (+cu117)
Clone this repository.
Install the Python requirements:
```
pip install -r requirements.txt
```
If you want to use the cleaned texts already provided in filelists, you also need to install espeak.
```
apt-get install espeak
```
Prepare datasets & configuration
1. Prepare wav files (22050 Hz, mono, PCM-16).
2. Prepare two text files, one for training and one for validation, and split your dataset between them. The validation file should contain fewer entries than the training file, and none of its entries should also appear in the training file.
  - Single speaker
```
wavfile_path|transcript
```
  - Multi speaker
```
wavfile_path|speaker_id|transcript
```
3. Run preprocessing with the cleaner of your choice. You may also change the symbol set.
  - Single speaker
```
python preprocess.py --text_index 1 --filelists PATH_TO_train.txt --text_cleaners CLEANER_NAME
python preprocess.py --text_index 1 --filelists PATH_TO_val.txt --text_cleaners CLEANER_NAME
```
  - Multi speaker
```
python preprocess.py --text_index 2 --filelists PATH_TO_train.txt --text_cleaners CLEANER_NAME
python preprocess.py --text_index 2 --filelists PATH_TO_val.txt --text_cleaners CLEANER_NAME
```
  The resulting cleaned text keeps the same single- or multi-speaker layout as the input filelist.
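
The dataset rules above (22050 Hz mono PCM-16 audio, a smaller validation file, no entry shared between the two files) can be checked before preprocessing. The following is a minimal sketch using only the Python standard library; `check_wav`, `split_filelist`, and the default 5% validation share are hypothetical choices for illustration and are not part of this repository.

```python
import random
import wave


def check_wav(path):
    """Return True if the file is 22050 Hz, mono, 16-bit PCM.

    wave.open raises wave.Error for files that are not PCM wav.
    """
    with wave.open(path, "rb") as wf:
        return (
            wf.getframerate() == 22050
            and wf.getnchannels() == 1
            and wf.getsampwidth() == 2  # 2 bytes per sample = PCM-16
        )


def split_filelist(lines, val_fraction=0.05, seed=0):
    """Split filelist lines into disjoint (train, val) lists.

    Each line is `wavfile_path|transcript` (single speaker) or
    `wavfile_path|speaker_id|transcript` (multi speaker).
    """
    # Drop blank lines and exact duplicates so no entry lands in both splits.
    unique = list(dict.fromkeys(l.strip() for l in lines if l.strip()))
    for line in unique:
        if len(line.split("|")) not in (2, 3):
            raise ValueError(f"unexpected number of fields: {line!r}")
    # Seeded shuffle keeps the split reproducible across runs.
    random.Random(seed).shuffle(unique)
    n_val = max(1, int(len(unique) * val_fraction))
    return unique[n_val:], unique[:n_val]
```

The two returned lists can then be written to the training and validation text files that are passed to `preprocess.py`.
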
Build Monotonic Alignment Search.

```
# Cython-version Monotonic Alignment Search
cd monotonic_align
mkdir monotonic_align
python setup.py build_ext --inplace
```
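
For readers unfamiliar with what the compiled extension computes, here is a plain NumPy sketch of the monotonic alignment search dynamic program as it is generally described for VITS-style models. It is not the Cython code built above; the function name and the `[n_text, n_frames]` input layout are assumptions made for illustration. The nested Python loops are slow, which is why a compiled version is used in training.

```python
import numpy as np


def monotonic_alignment_search(log_p):
    """Find the best monotonic path through a [n_text, n_frames] score matrix.

    log_p[i, j] is the log-likelihood of text token i explaining frame j.
    Returns a 0/1 matrix of the same shape with exactly one token per frame,
    starting at token 0 and ending at the last token without skipping any.
    """
    n_text, n_frames = log_p.shape
    if n_text > n_frames:
        raise ValueError("need at least one frame per text token")

    # q[i, j]: best cumulative score of any valid path ending at (i, j).
    q = np.full((n_text, n_frames), -np.inf)
    q[0, 0] = log_p[0, 0]
    for j in range(1, n_frames):
        for i in range(min(j + 1, n_text)):
            stay = q[i, j - 1]                             # same token as previous frame
            advance = q[i - 1, j - 1] if i > 0 else -np.inf  # moved on from previous token
            q[i, j] = log_p[i, j] + max(stay, advance)

    # Backtrack from the last token at the last frame.
    path = np.zeros((n_text, n_frames), dtype=np.int64)
    i = n_text - 1
    for j in range(n_frames - 1, -1, -1):
        path[i, j] = 1
        if j > 0 and i > 0 and (i == j or q[i - 1, j - 1] > q[i, j - 1]):
            i -= 1
    return path
```

Summing the returned path along the frame axis gives a per-token duration in frames.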

Edit the configuration to match the filelists and cleaners you used.

Setting json file in configs

| Model | How to set up json file in configs | Sample of json file configuration |
|---|---|---|
| iSTFT-VITS2 | `"istft_vits": true,` `"upsample_rates": [8,8],` | istft_vits2_base.json |
| MB-iSTFT-VITS2 | `"subbands": 4,` `"mb_istft_vits": true,` `"upsample_rates": [4,4],` | mb_istft_vits2_base.json |
| MS-iSTFT-VITS2 | `"subbands": 4,` `"ms_istft_vits": true,` `"upsample_rates": [4,4],` | ms_istft_vits2_base.json |
| Mini-iSTFT-VITS2 | `"istft_vits": true,` `"upsample_rates": [8,8],` `"hidden_channels": 96,` `"n_layers": 3,` | mini_istft_vits2_base.json |
| Mini-MB-iSTFT-VITS2 | `"subbands": 4,` `"mb_istft_vits": true,` `"upsample_rates": [4,4],` `"hidden_channels": 96,` `"n_layers": 3,` `"upsample_initial_channel": 256,` | mini_mb_istft_vits2_base.json |
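
If you prefer to derive a variant config from an existing one rather than edit the json by hand, the settings listed above can be applied programmatically. The sketch below is illustrative only: `VARIANT_OVERRIDES` and `apply_variant` are hypothetical names, and it assumes these keys live under a `"model"` section of the config, which you should verify against the sample json files.

```python
import copy
import json

# Overrides copied from the settings listed above, keyed by model name.
VARIANT_OVERRIDES = {
    "iSTFT-VITS2": {"istft_vits": True, "upsample_rates": [8, 8]},
    "MB-iSTFT-VITS2": {"subbands": 4, "mb_istft_vits": True,
                       "upsample_rates": [4, 4]},
    "MS-iSTFT-VITS2": {"subbands": 4, "ms_istft_vits": True,
                       "upsample_rates": [4, 4]},
    "Mini-iSTFT-VITS2": {"istft_vits": True, "upsample_rates": [8, 8],
                         "hidden_channels": 96, "n_layers": 3},
    "Mini-MB-iSTFT-VITS2": {"subbands": 4, "mb_istft_vits": True,
                            "upsample_rates": [4, 4], "hidden_channels": 96,
                            "n_layers": 3, "upsample_initial_channel": 256},
}


def apply_variant(config, variant, section="model"):
    """Return a copy of `config` with the variant's keys set in config[section].

    ASSUMPTION: the keys belong in the "model" section; pass another
    section name if the sample configs are laid out differently.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated.setdefault(section, {}).update(VARIANT_OVERRIDES[variant])
    return updated


if __name__ == "__main__":
    base = {"model": {"hidden_channels": 192, "n_layers": 6}}
    print(json.dumps(apply_variant(base, "Mini-iSTFT-VITS2"), indent=2))
```

In practice you would load the base dict with `json.load` from a file in `configs`, apply the variant, and write the result back with `json.dump`.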

Training Example

```
# train_ms.py for multi speaker
# train_l.py to use Lightning
python train_ms.py -c configs/shergin_d_vector_hfg.json -m models/test
```

Contact

If you have any questions about how to run it, contact us on Telegram:

https://t.me/voice_stuff_chat

