Documentation for voice training? #2
Comments
This would help the community create models for languages that are not currently supported. Some direction would be helpful on how to structure the data, how many hours of audio are needed, which scripts to run for training, and how to save the model for future use.
No documentation yet. I'm currently rewriting the training code to be usable by people other than myself. It was written over the course of a year or more, with different experiments and dead ends left in it. It definitely needs some cleanup! The structure of the training data is very simple, currently just a CSV file with two columns: (1) path or name of the audio file, and (2) text transcription. For example:
If you have multiple speakers, it becomes:
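A minimal sketch of both layouts, assuming comma-separated fields, WAV file names, and a speaker name as the middle column. None of these details are confirmed in this thread, and the file names and helper functions below are hypothetical:

```python
import csv

# Hypothetical file names; the transcriptions are two of the Harvard Sentences.
SINGLE_SPEAKER = [
    ("audio_0001.wav", "The birch canoe slid on the smooth planks."),
    ("audio_0002.wav", "Glue the sheet to the dark blue background."),
]

# Assumed multi-speaker layout: a speaker name as the middle column.
MULTI_SPEAKER = [
    ("audio_0001.wav", "speaker_a", "The birch canoe slid on the smooth planks."),
    ("audio_0002.wav", "speaker_b", "Glue the sheet to the dark blue background."),
]


def write_metadata(rows, csv_path):
    """Write one row per utterance; the csv module quotes text that contains commas."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def read_metadata(csv_path, multi_speaker=False):
    """Read the rows back and check that each has the expected number of columns."""
    expected = 3 if multi_speaker else 2
    rows = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for line_no, row in enumerate(csv.reader(f), start=1):
            if len(row) != expected:
                raise ValueError(
                    f"line {line_no}: expected {expected} columns, got {len(row)}"
                )
            rows.append(tuple(row))
    return rows
```

Checking the column count when reading catches the most common mistake early: a transcription containing an unquoted comma that silently splits into extra fields.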
Eventually, I'd like to use the data format from Mimic Recording Studio. Audio files can be anything that librosa will load. As for the amount of data, that depends on whether you're starting from scratch or reusing an existing model. From scratch, I've found that 3-5 hours will get you a good voice, but 10+ will usually make a great voice. What really matters is the recording quality and the phonetic diversity of what you read. If you reuse an existing model, as little as 30 minutes of data has worked for me, using the Harvard Sentences. I'd recommend at least an hour, though.
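To compare a dataset against the rough figures above, the total recording time can be tallied. The sketch below is only an illustration: it uses Python's standard `wave` module rather than librosa, so it assumes uncompressed WAV files, and the function names and wording of the messages are hypothetical.

```python
import wave


def wav_duration_seconds(path):
    """Duration of one uncompressed WAV file, from its frame count and sample rate."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_hours(paths):
    """Sum the durations of all files and convert seconds to hours."""
    return sum(wav_duration_seconds(p) for p in paths) / 3600.0


def describe(hours, from_scratch=True):
    """Compare a total against the rough figures quoted in this thread."""
    if from_scratch:
        if hours >= 10:
            return "10+ hours: reported to usually make a great voice"
        if hours >= 3:
            return "3+ hours: reported to give a good voice"
        return "under 3 hours: below the amounts reported for training from scratch"
    # Reusing an existing model.
    if hours >= 1:
        return "1+ hour: meets the recommended minimum for reusing a model"
    if hours >= 0.5:
        return "30+ minutes: has worked, but at least an hour is recommended"
    return "under 30 minutes: less than the smallest amount reported to work"
```

The hour count is only a first check; as noted above, recording quality and phonetic diversity matter more than the raw total.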
Thanks for the quick reply. I am starting to build a good-quality TTS voice for Hindi, so I am gathering information on what a good dataset requires.
A few of the questions above may be out of scope for this repo, but any help you can offer would be great.
@synesthesiam do you have an update on getting the training code ready to use? I am interested in using it as well.
It would also be interesting to know which model you are using, or to have a reference to the paper.
@synesthesiam I'd also love to make a voice model (in English). From what you've said in this thread, I think I could get started, but I'm wondering what I would need to do with the CSV file once I have made it. Or perhaps I'm really wondering whether you are close to finishing the cleanup of the training code, since I bet that would be easier to use than forging ahead alone. Better still would be if Mimic Recording Studio is close to being ready to use for Mimic 3.
Hi,
Is there any documentation anywhere on how to train/create a new voice for this from e.g. audio collected by mimic-recording-studio?
Many thanks
Rob