
ASR: Is there any checked and stable way for pretrain? #11813

Open
ican24 opened this issue Jan 10, 2025 · 1 comment


ican24 commented Jan 10, 2025

Hello,
I am trying to develop ASR tools for two custom languages.
I have run speech_to_text_ctc_bpe.py for pretraining three or four times, but in every case I got no usable result. My best run reached WER 0.98.

  
python speech_to_text_ctc_bpe.py \
	--config-path=../conf/fastconformer \
	--config-name=fast-conformer_ctc_bpe \
	model.train_ds.manifest_filepath=/nemoasr/train.5 \
	model.validation_ds.manifest_filepath=/nemoasr/test.5 \
	model.tokenizer.dir=/nemoasr/tokenizer/tokenizer_spe_unigram_v512 \
	model.tokenizer.type=bpe \
	trainer.devices=-1 \
	trainer.accelerator="gpu" \
	trainer.strategy="ddp" \
	trainer.max_epochs=100 \
	model.optim.name="adamw" \
	model.optim.lr=0.001 \
	model.optim.betas=[0.9,0.999] \
	model.optim.weight_decay=0.0001 \
	model.optim.sched.warmup_steps=2000
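
For reference, a WER near 1.0 usually means the model is emitting almost nothing, which can happen when the tokenizer was built from text that does not match the manifest transcripts. Below is a minimal sanity check of the tokenizer/manifest pair; the paths mirror the command above, and the tokenizer.model filename assumes the standard output of NeMo's process_asr_text_tokenizer.py, so treat it as an assumption.

import json
import sentencepiece as spm

# Paths mirror the command above; tokenizer.model is the file name normally
# written by NeMo's process_asr_text_tokenizer.py (an assumption here).
sp = spm.SentencePieceProcessor(
    model_file="/nemoasr/tokenizer/tokenizer_spe_unigram_v512/tokenizer.model"
)

total, unk = 0, 0
with open("/nemoasr/train.5", encoding="utf-8") as f:
    for line in f:
        entry = json.loads(line)        # one JSON object per manifest line
        ids = sp.encode(entry["text"])  # transcript -> token ids
        total += len(ids)
        unk += sum(1 for i in ids if i == sp.unk_id())

# A high <unk> rate suggests the tokenizer does not cover the transcripts
# (e.g. different text normalization or a different script).
print(f"unk rate: {unk / max(total, 1):.2%}")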

Meanwhile, I can get reasonable results (WER 3.5-4.5) using a script that I put together from open-source examples found on the Internet (Japanese, Vietnamese).
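
For the comparison above, I score both runs the same way. This is only a sketch: "checkpoint.nemo" is an illustrative path, restore_from/transcribe are NeMo's standard ASR APIs, jiwer is used for WER, and depending on the NeMo version transcribe may return hypothesis objects rather than plain strings.

import json
import jiwer
import nemo.collections.asr as nemo_asr

# Restore a trained checkpoint ("checkpoint.nemo" is illustrative) and
# score it against the test manifest from the command above.
model = nemo_asr.models.EncDecCTCModelBPE.restore_from("checkpoint.nemo")

refs, paths = [], []
with open("/nemoasr/test.5", encoding="utf-8") as f:
    for line in f:
        entry = json.loads(line)
        paths.append(entry["audio_filepath"])
        refs.append(entry["text"])

hyps = model.transcribe(paths)  # one transcript per audio file
print("WER:", jiwer.wer(refs, hyps))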

So what I would like to know is: is there a verified, stable recipe for pretraining?
Thanks in advance!


ican24 commented Jan 15, 2025

Does no one know the pretraining mechanism?
That seems very strange.
