dGSLM is a project for textless speech modeling using Fairseq. This guide will help you set up the environment and install the necessary dependencies.
Follow the steps below to set up the environment using conda and pip:
conda create -n dGSLM python=3.9
conda activate dGSLM
Install PyTorch, torchvision, and torchaudio with CUDA 12.1 support:
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
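To confirm that the CUDA build of PyTorch was picked up, a quick check along these lines may help. This is a minimal sketch, not part of dGSLM: `check_torch` is a hypothetical helper name, and it assumes `python` resolves to the interpreter of the active conda environment.

```shell
# Hypothetical helper: print the PyTorch version, the CUDA version it was
# built against, and whether a CUDA device is visible at runtime.
check_torch() {
    python -c 'import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())' 2>/dev/null \
        || echo "torch not importable"
}

check_torch
```

If the last field printed is `False`, PyTorch is installed but cannot see a GPU, which usually points to a driver or environment issue rather than a failed install.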
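Pin pip to 23.3.1 before installing omegaconf 2.0.6 (newer pip releases may refuse to install this omegaconf version):
pip install pip==23.3.1
pip install omegaconf==2.0.6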
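The pinned versions can be checked afterwards with something like the following. It is only a sketch: `show_version` is a hypothetical helper, and it relies on the `Version:` line that `pip show` prints for an installed package.

```shell
# Hypothetical helper: print the installed version of a package,
# or a notice if pip does not know the package.
show_version() {
    python -m pip show "$1" 2>/dev/null | grep '^Version:' \
        || echo "$1: not installed"
}

show_version pip
show_version omegaconf
```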
Clone the Fairseq repository and install it in editable mode:
git clone git@github.com:facebookresearch/fairseq.git
cd fairseq
pip install --editable ./
Install the audio dependencies:
pip install soundfile librosa
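Before running inference, it can be worth confirming that the packages installed in the steps above import cleanly. The loop below is a minimal sketch under the assumption that `python` is the interpreter of the `dGSLM` environment; `check_module` is a hypothetical helper, not something shipped with Fairseq.

```shell
# Hypothetical helper: try to import a Python module and report the result.
check_module() {
    if python -c "import $1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Modules installed by the steps in this guide.
for mod in fairseq soundfile librosa; do
    check_module "$mod"
done
```

Any `missing:` line indicates that the corresponding install step should be repeated inside the activated environment.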
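Run inference. The script path is presumably relative to the dGSLM project directory rather than the fairseq checkout, so change back to that directory first if needed:
python src/inference.py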