BCDA

Usage

Pretraining for the Voice Activity Projection Model:

>>> python vap_train.py data/conf/vap_pretrained_model-best.yaml

Training for the Backchannel Prediction Model:

>>> python bcda_train.py data/conf/whole.yaml --pretrained data/model/vap_pretrained_model-best.ckpt

All configurations used in the paper are in the folder data/conf. By default, trained models are saved to data/model_checkpoints.

Evaluating a Backchannel Prediction Model, e.g.:

>>> python bcda_eval.py data/conf/whole.yaml data/model_checkpoints/checkpoint-epoch=3-step=50256.ckpt
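
The checkpoint path in the evaluation command changes with every training run. As a convenience, the following minimal sketch picks the most recent checkpoint from the default output folder. It is not part of the repository: the helper name `latest_checkpoint` is hypothetical, and the file name pattern is an assumption inferred from the single example name above.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme, inferred from "checkpoint-epoch=3-step=50256.ckpt".
CKPT_PATTERN = re.compile(r"checkpoint-epoch=(\d+)-step=(\d+)\.ckpt$")


def latest_checkpoint(folder: str = "data/model_checkpoints") -> Optional[Path]:
    """Return the checkpoint with the highest (epoch, step), or None if none is found."""
    root = Path(folder)
    if not root.is_dir():
        return None
    best = None
    best_key = (-1, -1)
    for path in root.glob("*.ckpt"):
        match = CKPT_PATTERN.search(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed naming
        key = (int(match.group(1)), int(match.group(2)))
        if key > best_key:
            best, best_key = path, key
    return best


if __name__ == "__main__":
    print(latest_checkpoint())
```

The printed path can then be passed as the second argument to bcda_eval.py.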

Set Up

Step 0:

Create and activate a new clean conda environment:

>>> conda create -n myenv python=3.9
>>> conda activate myenv

Step 1:

To install the appropriate PyTorch version, first find your CUDA version. In Windows PowerShell or a standard Linux terminal, run:

>>> nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 516.94       CUDA Version: **11.7**     |
|-------------------------------+----------------------+----------------------+
...
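
If you prefer to read the CUDA version programmatically, the sketch below extracts it from the `nvidia-smi` banner. It assumes the banner contains a `CUDA Version: X.Y` field as in the output above; the function names are hypothetical and not part of this repository.

```python
import re
import subprocess
from typing import Optional


def parse_cuda_version(smi_output: str) -> Optional[str]:
    """Extract the 'CUDA Version' field from nvidia-smi output, if present."""
    match = re.search(r"CUDA Version:\s*(\d+(?:\.\d+)?)", smi_output)
    return match.group(1) if match else None


def detect_cuda_version() -> Optional[str]:
    """Run nvidia-smi and return the reported CUDA version, or None on failure."""
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None  # no NVIDIA driver or nvidia-smi is not on the PATH
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    print(detect_cuda_version())
```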

Step 2:

Find the matching PyTorch version and copy the command from: https://pytorch.org/get-started/previous-versions/ No build for CUDA 11.7 was listed there, so we rely on backwards compatibility and install the CUDA 11.6 build:

>>> conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
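
After the installation, it can be worth confirming that PyTorch actually sees the GPU. The following is a minimal sketch of such a check; the helper name `torch_cuda_report` is hypothetical, and it only uses `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
def torch_cuda_report() -> dict:
    """Summarize the installed PyTorch build and whether CUDA is usable."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version the binary was built against (None for CPU-only builds)
        "built_for_cuda": torch.version.cuda,
        # True only if a compatible driver and GPU are found at runtime
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False although a GPU is present, the driver and the installed build are probably not compatible.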

Step 3:

Install all standard dependencies:

>>> pip install -r requirements.txt

Step 4:

Manually install the Contrastive Predictive Coding (CPC) audio encoder:

>>> pip install git+https://github.com/facebookresearch/CPC_audio.git

The installation may fail and ask you to install the C++ development tools with Visual Studio. If so, install them and then run the above command again.

Step 5:

Prepare the data. You will need the freely available transcripts as well as the licensed audio data.
Download transcripts from: https://www.openslr.org/resources/5/switchboard_word_alignments.tar.gz
Buy audio files from: https://catalog.ldc.upenn.edu/LDC97S62
Finally, place all data into the subfolder data/swb. The folder structure should look like this:

data/swb/
  -> swb_audios
        -> sw02001.sph
        -> sw02005.sph
        -> ...
        -> sw04940.sph
  -> swb_ms98_transcriptions
        -> 20/
                -> 2001/
                -> ...
        -> 21/
        -> ...
        -> 49/
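
Before training, the layout above can be checked with a small script. This is a sketch under the assumption that the two subfolder names and the `.sph` extension are exactly as shown; the function `check_swb_layout` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List


def check_swb_layout(root: str = "data/swb") -> List[str]:
    """Return a list of problems found in the data folder (empty if it looks fine)."""
    problems = []
    base = Path(root)
    audio_dir = base / "swb_audios"
    trans_dir = base / "swb_ms98_transcriptions"

    if not audio_dir.is_dir():
        problems.append(f"missing folder: {audio_dir}")
    elif not any(audio_dir.glob("*.sph")):
        problems.append(f"no .sph audio files in {audio_dir}")

    if not trans_dir.is_dir():
        problems.append(f"missing folder: {trans_dir}")
    elif not any(p.is_dir() and p.name.isdigit() for p in trans_dir.iterdir()):
        problems.append(f"no numbered subfolders (e.g. 20/) in {trans_dir}")

    return problems


if __name__ == "__main__":
    issues = check_swb_layout()
    print("Layout looks fine." if not issues else "\n".join(issues))
```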

Step 6:

To run the test suite, first read the instructions in tests/data/. Finally, check that the code runs properly with:

>>> python -m pytest
