# vid2vid

A modified version of vid2vid for Speech2Video and Text2Video.

## Setup

1. Clone the repository:

   ```shell
   git clone git@github.com:sibozhang/vid2vid.git
   ```

2. Set up the environment. torchvision must be 0.2.2 to be compatible with torch 0.4.1:

   ```shell
   python3 -m venv ../venv/vid2vid
   source ../venv/vid2vid/bin/activate
   pip install --upgrade pip
   pip3 install https://download.pytorch.org/whl/cu92/torch-0.4.1-cp36-cp36m-linux_x86_64.whl
   pip install torchvision==0.2.2
   pip install numpy
   pip install dominate requests
   pip install pillow
   pip install opencv-python
   pip install scipy
   pip install pytz
   ```
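After installing, a quick sanity check can confirm the pinned versions are in place. This is a minimal sketch, not part of the repository; `installed_version` is a hypothetical helper defined here for illustration:

```python
import importlib

def installed_version(pkg):
    """Return the package's __version__, 'unknown' if it has none, or None if missing."""
    try:
        mod = importlib.import_module(pkg)
    except ImportError:
        return None
    return getattr(mod, "__version__", "unknown")

# The versions this README pins:
for pkg, wanted in [("torch", "0.4.1"), ("torchvision", "0.2.2")]:
    got = installed_version(pkg)
    if got is None:
        print(f"{pkg} is not installed -- rerun the pip commands above")
    elif not got.startswith(wanted):
        print(f"{pkg} is {got}, but this repo expects {wanted}")
    else:
        print(f"{pkg} {got} looks good")
```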

## Trained model

Please create a `checkpoints` folder in the current directory and put the trained model in it.
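From the repository root, the folder can be created as follows (a minimal sketch; the archive path below is a placeholder, use whichever file you downloaded from one of the links):

```shell
# Create the checkpoints folder next to the vid2vid code.
mkdir -p checkpoints
# Extract the downloaded model into it, e.g.:
# unzip /path/to/downloaded_model.zip -d checkpoints/
ls checkpoints
```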

VidTIMIT fadg0 (English, Female): Dropbox

Baidu Cloud link: https://pan.baidu.com/s/1L1cvqwLu_uqN2cbW-bDgdA password: hygt

Xuesong (Chinese, Male): Dropbox

Baidu Cloud link: https://pan.baidu.com/s/1lhYRakZLnkQ8nqMuLJt_dA password: 40ob

## Q&A

1. Get vid2vid working:

   ```shell
   cd vid2vid/models/flownet2_pytorch
   export CUDA_HOME=/tools/cuda-9.2.88/
   # Comment out "--user" in flownet2_pytorch/install.sh so the package installs
   # into the venv's Python rather than into ~/.local
   bash install.sh
   ```
2. Q: Running the code fails with:

   ```
   File "/mnt/scratch/sibo/vid2vid/util/util.py", line 62, in tensor2im
     image_numpy = image_tensor.cpu().float().numpy()
   RuntimeError: PyTorch was compiled without NumPy support
   ```

   A: Install a torch build compiled with NumPy support:

   ```shell
   pip install torch==0.4.1.post2
   ```
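For context, the failing line converts a torch tensor into a NumPy image array. Below is a NumPy-only sketch of that kind of conversion; the `[-1, 1]` scaling convention is an assumption based on common image-synthesis utilities, not copied from this repo's `util.py`:

```python
import numpy as np

def tensor2im_sketch(image_array, imtype=np.uint8):
    # Assumed convention: CHW float data in [-1, 1] -> HWC uint8 image.
    image_numpy = (np.transpose(image_array, (1, 2, 0)) + 1) / 2.0 * 255.0
    return image_numpy.astype(imtype)

chw = np.zeros((3, 4, 4), dtype=np.float32)  # dummy "image" of zeros
img = tensor2im_sketch(chw)
print(img.shape, img.dtype)  # (4, 4, 3) uint8
```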

## Citation

Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses

Miao Liao*, Sibo Zhang*, Peng Wang, Hao Zhu, Xinxin Zuo, Ruigang Yang. PDF Result Video 1 min Spotlight 10 min Presentation

```
@inproceedings{liao2020speech2video,
  title={Speech2video synthesis with 3D skeleton regularization and expressive body poses},
  author={Liao, Miao and Zhang, Sibo and Wang, Peng and Zhu, Hao and Zuo, Xinxin and Yang, Ruigang},
  booktitle={Proceedings of the Asian Conference on Computer Vision},
  year={2020}
}
```

## Acknowledgements

This code is based on the vid2vid framework.