tabisheva/speech-recognition-quartznet

QuartzNet

A PyTorch implementation of QuartzNet, an end-to-end automatic speech recognition (ASR) model, trained on the LJSpeech dataset.

Usage

Set your preferred configuration in config.py and run ./run_docker.sh (make sure the volume option in the script points to the correct directory).

Training

wget https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2   # download data
tar xjf LJSpeech-1.1.tar.bz2                                      # extract data
python train.py

Log in to your wandb.ai account to monitor the training logs.

After the 40th epoch, every 10th checkpoint is saved as model{epoch}.pth.
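The checkpoint schedule can be sketched as below. This is not the code from train.py. It is a minimal illustration under two assumptions: "after the 40th epoch" is read strictly (epoch > 40), and "every 10th" means epochs divisible by 10. The helper names should_save_checkpoint and checkpoint_name are hypothetical.

```python
def should_save_checkpoint(epoch: int, start_epoch: int = 40, every: int = 10) -> bool:
    """Return True if a checkpoint would be written at this epoch.

    Assumptions (not confirmed by the repository): the start epoch is
    exclusive (epoch > 40) and saving happens on epochs divisible by 10.
    """
    return epoch > start_epoch and epoch % every == 0


def checkpoint_name(epoch: int) -> str:
    # Follows the model{epoch}.pth naming pattern given in this README.
    return f"model{epoch}.pth"


# File names produced over epochs 1..80 under the assumptions above.
saved = [checkpoint_name(e) for e in range(1, 81) if should_save_checkpoint(e)]
print(saved)  # ['model50.pth', 'model60.pth', 'model70.pth', 'model80.pth']
```

In a training loop, a check like this would typically guard a call such as torch.save(model.state_dict(), checkpoint_name(epoch)). If the actual script also saves at epoch 40, the comparison would be >= instead of >.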

Inference

In the config, set path_to_file to the path of a .wav file and from_pretrained=True, then run:

python inference.py

The recognized text will be saved in path_to_file.txt.

About

Speech recognition for English using PyTorch
