To train a new voice for English, how many hours of audio do you recommend? #194

xiao1ongbao · 2024-09-27T02:42:40Z

To train a new voice for English, how many hours of audio do you recommend?
Does the training script train from scratch, or does it fine-tune the existing model?
Thanks!

iv2985 · 2024-10-27T09:47:45Z

If one takes G_0.pth (the first checkpoint) during training and uses it for inference, it speaks English with a young female voice that doesn't match the audio clips being trained on. So it seems the script is fine-tuning from that starting point rather than training from scratch.

As for the duration of audio, I have gotten reasonable results with only 5 minutes of audio and 1k epochs, using 48 kHz WAV files. Most people use 1+ hours, however.
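
For anyone who wants to check how much audio they actually have before training, here is a minimal sketch that sums the duration of the WAV files in a folder and reports the sample rates it finds. It is not part of this repo: the function name `total_wav_seconds` and the folder layout are assumptions for illustration, and it relies on Python's standard `wave` module, so it assumes uncompressed PCM WAV files.

```python
import wave
from pathlib import Path


def total_wav_seconds(folder):
    """Return (total_seconds, sample_rates) for all .wav files under folder.

    Hypothetical helper, not part of the training script.
    Assumes uncompressed PCM WAV files readable by the stdlib wave module.
    """
    total = 0.0
    rates = set()
    for path in sorted(Path(folder).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            # Duration of one clip = number of frames / frames per second.
            total += wav.getnframes() / rate
            rates.add(rate)
    return total, rates
```

Calling `total_wav_seconds("dataset/wavs")` (a placeholder path) returns the total in seconds, so dividing by 60 gives minutes to compare against the 5 minute and 1+ hour figures above. If the returned set holds more than one sample rate, the clips are not all at the same rate (for example, not all 48 kHz).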
