Constant Memory WaveGlow

A PyTorch implementation of WaveGlow: A Flow-based Generative Network for Speech Synthesis, using the constant-memory method described in Training Glow with Constant Memory Cost.
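As a rough illustration of why constant memory is possible: each flow layer is invertible, so its input can be recomputed from its output during the backward pass instead of being stored. The NumPy sketch below only demonstrates that invertibility on a toy affine coupling layer. It is not the code used in this repository, and toy_net is a hypothetical stand-in for the WaveNet-like network inside a real coupling layer.

```python
import numpy as np


def toy_net(x_a, weight):
    """Hypothetical stand-in for the network that predicts log-scale and shift."""
    h = np.tanh(x_a @ weight)
    log_s, t = np.split(h, 2, axis=-1)
    return log_s, t


def coupling_forward(x, weight):
    """Affine coupling: the first half passes through, the second half is transformed."""
    x_a, x_b = np.split(x, 2, axis=-1)
    log_s, t = toy_net(x_a, weight)
    y_b = x_b * np.exp(log_s) + t
    return np.concatenate([x_a, y_b], axis=-1)


def coupling_inverse(y, weight):
    """Recover the layer input from its output using the same parameters."""
    y_a, y_b = np.split(y, 2, axis=-1)
    log_s, t = toy_net(y_a, weight)  # y_a equals x_a, so log_s and t are identical
    x_b = (y_b - t) * np.exp(-log_s)
    return np.concatenate([y_a, x_b], axis=-1)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
weight = rng.standard_normal((4, 8)) * 0.1

y = coupling_forward(x, weight)
x_rec = coupling_inverse(y, weight)
print(np.allclose(x, x_rec))
```

Because x can be rebuilt from y, a training loop only needs to keep the output of the last layer and can walk backwards through the stack, which is what keeps activation memory independent of depth.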

The model implementation details differ slightly from the official implementation based on personal preference, and the project structure is borrowed from pytorch-template.

In addition, we provide implementations of Baidu's WaveFlow and of MelGlow, which are easier to train and more memory friendly.

Besides neural vocoders, we also include an implementation of the audio super-resolution model WSRGlow.

Requirements

After installing the requirements from pytorch-template:

pip install nnAudio torch_optimizer

Quick Start

Set data_dir in the JSON config file to a directory containing wave files that all share the same sampling rate, and you are good to go. The mel-spectrogram is computed on the fly.

{
  "data_loader": {
    "type": "RandomWaveFileLoader",
    "args": {
      "data_dir": "/your/data/wave/files",
      "batch_size": 8,
      "num_workers": 2,
      "segment": 16000
    }
  }
}
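The segment value above is presumably the length, in samples, of the clip drawn from each wave file. A minimal sketch of that kind of random cropping is shown here. The function random_segment is hypothetical and not part of this repository, and zero-padding clips shorter than the segment is an assumption made for the example.

```python
import numpy as np


def random_segment(wave, segment, rng):
    """Return a random contiguous slice of `segment` samples.

    Clips shorter than `segment` are zero-padded at the end (an assumption).
    """
    if len(wave) <= segment:
        return np.pad(wave, (0, segment - len(wave)))
    start = rng.integers(0, len(wave) - segment + 1)
    return wave[start:start + segment]


rng = np.random.default_rng(0)
wave = rng.standard_normal(48000).astype(np.float32)  # stand-in for a loaded wave file
clip = random_segment(wave, 16000, rng)
print(clip.shape)
```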

python train.py -c config.json
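If you switch between datasets often, data_dir can also be set from a small script before launching training. This is only a convenience sketch using the standard json module. The helper set_data_dir is hypothetical, and it assumes the config follows the data_loader / args / data_dir layout shown above.

```python
import json


def set_data_dir(config_path, data_dir):
    """Rewrite data_loader.args.data_dir in a JSON config and return the new config."""
    with open(config_path) as f:
        config = json.load(f)
    config["data_loader"]["args"]["data_dir"] = data_dir
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config


# Example call (requires an existing config file):
# set_data_dir("config.json", "/your/data/wave/files")
```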

Memory consumption of model training in PyTorch

Model	Memory (MB)
WaveGlow, channels=256, batch size=24 (naive)	N.A.
WaveGlow, channels=256, batch size=24 (efficient)	4951

Result

WaveGlow

I trained the model on some cello music pieces from MusicNet using musicnet_config.json. The clips in the samples folder are what I got. Although the audio quality is not very good, they show that WaveGlow can be used for music generation as well. The generation speed is around 470 kHz (about 470,000 samples per second) on a 1080 Ti.
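To put the generation speed in context, it can be converted to a real-time factor by dividing by the audio sampling rate. The sketch below uses 44100 Hz purely as an illustrative assumption; substitute the sampling rate of your own data.

```python
def real_time_factor(samples_per_second, sample_rate):
    """Seconds of audio generated per second of wall-clock time."""
    return samples_per_second / sample_rate


# 470 kHz is the speed reported above; 44100 Hz is an assumed sampling rate.
rtf = real_time_factor(470_000, 44_100)
print(f"{rtf:.1f}x real time")
```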

WaveFlow

I trained on the full LJ Speech dataset using waveflow_LJ_speech.json. The settings correspond to the model with 64 residual channels and h=64 in the paper. After about 1.25M training steps, the audio quality is very similar to their official examples. Samples generated from training data can be listened to here.

MelGlow

Coming soon.

WSRGlow

Pre-trained models on the VCTK dataset are available here. We follow the settings of NU-Wave to prepare the training data.

Citation

If you use our code in any project or research, please cite:

@misc{memwaveglow,
  doi          = {10.5281/zenodo.3874330},
  author       = {Chin Yun Yu},
  title        = {Constant Memory WaveGlow: A PyTorch implementation of WaveGlow with constant memory cost},
  howpublished = {\url{https://github.com/yoyololicon/constant-memory-waveglow}},
  year         = {2019}
}
