Is there a way to expose/control more settings, see these settings, ensure consistent output (reads)?
Sure! I'll expose the seed as a controllable setting. The variability is a natural outcome of Tortoise (or any neural-net-based TTS), and a fixed seed will keep the output consistent across generations for the same inputs.
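To show why a fixed seed helps, here is a minimal toy sketch. It is not Tortoise's code or API: `sample_prosody` is a hypothetical stand-in that draws one random "pause length" per word, roughly the way a sampling-based TTS model makes random choices about pacing and inflection. The point is only that the same seed plus the same inputs reproduces the same draws.

```python
import random


def sample_prosody(text, seed=None):
    """Toy stand-in for a sampling-based TTS model (hypothetical, not Tortoise's API)."""
    # With seed=None the generator is seeded from OS entropy, so each call varies.
    rng = random.Random(seed)
    # One random "pause length" per word, standing in for pacing/inflection choices.
    return [round(rng.uniform(0.05, 0.40), 3) for _ in text.split()]


text = "The same sentence, read twice."

# Same seed and same text: the random draws are identical.
print(sample_prosody(text, seed=1234) == sample_prosody(text, seed=1234))

# No seed: two calls will almost certainly produce different draws.
print(sample_prosody(text) == sample_prosody(text))
```

In a real model the seed would also need to cover every source of randomness involved (for example the framework's own random number generators), and changing the text, voice, or any other setting would still change the result.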
Even with the same text (sentences) and settings, multiple generations result in radically different pacing, style, and inflection.
This file contains several audio clips of the same text, all generated with the same models and settings: https://we.tl/t-0M8VeAMAt0
Note the differences in style, inflection, pronunciation, and pacing.