Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation
Create the environment:
conda env create -f gesture2vec.yml
conda activate gesture2vec
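A quick way to check that the environment resolves correctly (assuming it pins PyTorch, which the training scripts below require):

import torch

# Confirm PyTorch imports and report whether a GPU is visible.
print(torch.__version__, torch.cuda.is_available())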
Download fastText vectors:
mkdir resource
cd resource
wget https://dl.fbaipublicfiles.com/fasttext/vectors-english/crawl-300d-2M-subword.zip
unzip crawl-300d-2M-subword.zip
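To sanity-check the download before training, a minimal sketch along these lines should work; it assumes the fasttext Python package is available in the environment, which is not guaranteed by the commands above.

import fasttext

# Load the subword-aware binary model extracted from the zip above
# (adjust the path to wherever you unzipped the archive).
model = fasttext.load_model("resource/crawl-300d-2M-subword.bin")

# Every word, including out-of-vocabulary ones, maps to a 300-dim vector.
vec = model.get_word_vector("gesture")
print(vec.shape)  # (300,)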
Make LMDB data for the training set (replace the path below with the location of your GENEA Challenge 2020 Training_data directory):
cd scripts
python trinity_data_to_lmdb.py /n/holylabs/LABS/kempner_fellows/Users/jennhu/GENEA_Challenge_2020_data_release/Training_data
This will create lmdb_train and lmdb_test, which should be treated as the training and validation sets, respectively.
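To confirm that the conversion produced data, the databases can be opened with the lmdb Python bindings; the per-entry serialization is specific to the conversion script, so this sketch (with an assumed output path) only counts records.

import lmdb

# Open the generated database read-only and count the stored samples.
env = lmdb.open("lmdb_train", readonly=True, lock=False)
with env.begin() as txn:
    print("lmdb_train entries:", txn.stat()["entries"])
env.close()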
Train the DAE:
python train_DAE.py --config=../config/DAE_GENEA_jh.yml
Or submit the SLURM script:
sbatch train_DAE.batch
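Before submitting a long job, it can help to print the resolved configuration; this is a small sketch that assumes the config file is plain YAML readable with PyYAML.

import yaml

# Print the training configuration referenced in the command above.
with open("../config/DAE_GENEA_jh.yml") as f:
    cfg = yaml.safe_load(f)

for key, value in cfg.items():
    print(f"{key}: {value}")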
This is an official PyTorch implementation of Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation (IROS 2022). In this paper, we present an automatic gesture generation model that uses a vector-quantized variational autoencoder structure as well as training techniques to learn a rigorous representation of gesture sequences. We then translate input text into a discrete sequence of associated gesture chunks in the learned gesture space. Subjective and objective evaluations confirm the success of our approach in terms of appropriateness, human-likeness, and diversity. We also introduce new objective metrics using the quantized gesture representation.
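For readers unfamiliar with vector quantization, the core codebook lookup that turns continuous latents into discrete gesture tokens can be illustrated in a few lines of PyTorch; this is a generic sketch of the technique with made-up names, not the model code in this repository.

import torch

def quantize(latents, codebook):
    # latents:  (batch, dim) continuous encoder outputs
    # codebook: (num_codes, dim) learned embedding table
    distances = torch.cdist(latents, codebook)   # pairwise Euclidean distances
    indices = distances.argmin(dim=1)            # discrete "gesture chunk" ids
    return indices, codebook[indices]            # ids and their quantized vectors

# Toy usage: 4 latent vectors of dimension 8, a codebook with 16 entries.
ids, quantized = quantize(torch.randn(4, 8), torch.randn(16, 8))
print(ids.shape, quantized.shape)  # torch.Size([4]) torch.Size([4, 8])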
TODO
This code is distributed under the MIT License.
Note that our code uses datasets including Trinity and Talking With Hands (TWH), each of which has its own license that must also be followed.
Please feel free to contact us ([email protected]) with any questions or concerns.