Meaningless Speech Output #2827

mesut92 · 2023-08-01T05:37:07Z

Hi,
I have a transcribed Turkish audiobook and I am trying to train a TTS model on it using recipes/ljspeech/train_vits.py. The dataset is about 5 hours long. I trained for 1000 epochs (200000 steps). The output sounds human, but the speech is meaningless. I have been trying to understand why for a week, but I could not fix it. I trained from scratch and did not initialize the model with anything. @erogol

Here is my config:
config = VitsConfig(
    audio=audio_config,
    run_name="vits_ljspeech",
    batch_size=32,
    eval_batch_size=16,
    batch_group_size=5,
    num_loader_workers=8,
    num_eval_loader_workers=4,
    run_eval=True,
    test_delay_epochs=-1,
    epochs=1000,
    text_cleaner="phoneme_cleaners",
    phonemizer="espeak",
    use_phonemes=True,
    phoneme_language="tr",
    phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
    compute_input_seq_cache=True,
    print_step=25,
    print_eval=True,
    mixed_precision=True,
    output_path=output_path,
    datasets=[dataset_config],
    cudnn_benchmark=False,
)
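
For reference, the numbers quoted above (about 5 hours of audio, 1000 epochs, 200000 steps, batch_size=32) can be cross-checked with a little arithmetic. The sketch below is only a rough consistency check, not part of the recipe: the function name `implied_dataset_stats` is hypothetical, and it assumes one optimizer step per batch, a full pass over the training data every epoch, and no samples held out or dropped.

```python
def implied_dataset_stats(total_steps, epochs, batch_size, total_hours):
    """Rough dataset statistics implied by the reported training run.

    Assumptions (not confirmed by the post): one step per batch,
    every epoch covers the whole training set, nothing is filtered out.
    """
    steps_per_epoch = total_steps / epochs
    # Upper bound on utterances per epoch: the last batch may be partial.
    approx_utterances = steps_per_epoch * batch_size
    avg_clip_seconds = total_hours * 3600 / approx_utterances
    return steps_per_epoch, approx_utterances, avg_clip_seconds


# Values taken from the post: 200000 steps, 1000 epochs, batch_size=32, ~5 hours.
steps_per_epoch, utterances, avg_seconds = implied_dataset_stats(200_000, 1000, 32, 5.0)
print(f"steps per epoch:        {steps_per_epoch:.0f}")
print(f"approx. utterances:     {utterances:.0f}")
print(f"avg. clip length (sec): {avg_seconds:.2f}")
```

Under these assumptions the run implies roughly 200 steps per epoch, about 6400 training utterances, and an average clip length a little under 3 seconds. If the actual clip count or clip lengths in the dataset look very different, the reported figures (or the assumptions above) may be worth double-checking.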

And my logs:
