GitHub - irfanfadhullah/Sequence-to-Sequence-Video-Captioning-Reproduce

S2VT: Sequence to Sequence Video to Text

This is a PyTorch implementation of Sequence to Sequence Video to Text (S2VT), covering both training and testing of the model.

Preparing Tools needed

Clone the coco-caption repository from https://github.com/tylin/coco-caption.git, then rename the cloned folder from coco-caption to coco_caption.
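The rename is presumably needed because a hyphen is not allowed in a Python module name, so a folder called coco-caption cannot be imported directly. A minimal sketch of the rename step, assuming the clone sits in the project root (the helper name `rename_for_import` is hypothetical and not part of this repository):

```python
from pathlib import Path


def rename_for_import(repo_dir):
    """Rename a cloned folder so that its name is a valid Python identifier."""
    repo_dir = Path(repo_dir)
    target = repo_dir.with_name(repo_dir.name.replace("-", "_"))
    # Only rename when the source exists and the target is not already there.
    if repo_dir.exists() and not target.exists():
        repo_dir.rename(target)
    return target


# A hyphenated name is not a valid identifier, the underscored one is.
print("coco-caption".isidentifier())  # False
print("coco_caption".isidentifier())  # True
```

Calling `rename_for_import("coco-caption")` after cloning has the same effect as renaming the folder by hand.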

Installs

This project uses Python 3.6. Installing PyTorch and the Python packages below with Anaconda is recommended.

PyTorch
NumPy
tqdm
pretrainedmodels
ffmpeg (can be installed with Anaconda)

Datasets

Download the YouTubeClips dataset from https://www.cs.utexas.edu/users/ml/clamp/videoDescription/YouTubeClips.tar and extract it into the ./data folder.
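The download and extraction can also be scripted. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the assumption that the clips are `.avi` files should be checked against the extracted archive.

```python
import tarfile
import urllib.request
from pathlib import Path

CLIPS_URL = "https://www.cs.utexas.edu/users/ml/clamp/videoDescription/YouTubeClips.tar"


def download_clips(archive="YouTubeClips.tar", url=CLIPS_URL):
    """Download the archive unless a local copy already exists."""
    archive = Path(archive)
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))
    return archive


def extract_clips(archive, dest="./data"):
    """Extract the archive into `dest` and return the video files found."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive)) as tar:
        tar.extractall(str(dest))
    # Assumption: the clips are stored as .avi files.
    return sorted(dest.rglob("*.avi"))
```

A typical call would be `extract_clips(download_clips())`, which leaves the clips under ./data.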

Prepare Flow Data

Run the following in a terminal (replace the path with your own directory):

python optical_flow.py --video_path [YouTubeClips path]
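How optical_flow.py stores the flow is not described here. The S2VT paper describes flow images in which the x and y flow values are centered around 128 and scaled into the 0 to 255 range, with the flow magnitude added as a third channel. A minimal NumPy sketch of that encoding follows; the function name and the `bound` value of 20 pixels are assumptions for illustration, not values taken from this repository.

```python
import numpy as np


def flow_to_image(flow, bound=20.0):
    """Encode an (H, W, 2) flow field as an (H, W, 3) uint8 image.

    Channels 0 and 1 hold the x and y flow centered at 128,
    channel 2 holds the flow magnitude. `bound` is an assumed
    maximum displacement that maps to the edge of the pixel range.
    """
    flow = np.asarray(flow, dtype=np.float32)
    scale = 127.0 / bound
    xy = np.clip(flow * scale + 128.0, 0, 255)
    magnitude = np.clip(np.linalg.norm(flow, axis=-1) * scale, 0, 255)
    image = np.dstack([xy[..., 0], xy[..., 1], magnitude])
    return np.rint(image).astype(np.uint8)
```

A pixel with zero motion maps to (128, 128, 0), and displacements beyond `bound` are clipped to the ends of the range.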

Extract Features and Corpus

Extract RGB features using VGG-16. Run the following in a terminal (replace the paths with your own directories):

python extract_features.py --video_path [YouTubeClips path]  --features_path [ ./data/msvd_vgg16_bn]

Extract flow features using AlexNet. Run the following in a terminal (replace the paths with your own directories):

python extract_features.py --video_path [OpticalFlow path] --features_path [ .\data\feats\msvd_alexnet_flow]
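Which frames extract_features.py passes to the network is not stated here. The S2VT paper reports sampling every tenth frame of a video, so a frame-selection step could be sketched as below. The function name, the default stride and the `max_frames` cap are assumptions for illustration, not settings read from the script.

```python
def sample_frame_indices(num_frames, stride=10, max_frames=80):
    """Return the indices of every `stride`-th frame, capped at `max_frames`.

    The cap is a hypothetical guard so that very long clips still
    produce a bounded number of feature vectors.
    """
    if num_frames <= 0 or stride <= 0:
        return []
    return list(range(0, num_frames, stride))[:max_frames]
```

For a 35-frame clip with the default stride this selects frames 0, 10, 20 and 30.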

Preprocess the video captions by running preproces_caption.py. Note: you need to adjust the paths inside the script first.
```
python preproces_caption.py
```
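What preproces_caption.py produces is not documented here. Caption preprocessing for a sequence-to-sequence model usually means tokenizing the captions, building a vocabulary and mapping each caption to a list of token ids. The sketch below illustrates that idea only; the special tokens, the function names and the tokenization rule are assumptions and may differ from the script.

```python
import re
from collections import Counter

# Hypothetical special tokens: padding, sentence start, sentence end, unknown word.
SPECIAL_TOKENS = ["<pad>", "<bos>", "<eos>", "<unk>"]


def tokenize(caption):
    """Lowercase a caption and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", caption.lower())


def build_vocab(captions, min_count=1):
    """Map each word seen at least `min_count` times to an integer id."""
    counts = Counter(tok for cap in captions for tok in tokenize(cap))
    words = sorted(w for w, n in counts.items() if n >= min_count)
    return {w: i for i, w in enumerate(SPECIAL_TOKENS + words)}


def encode(caption, vocab):
    """Turn a caption into ids, wrapped in <bos> and <eos>."""
    unk = vocab["<unk>"]
    ids = [vocab.get(tok, unk) for tok in tokenize(caption)]
    return [vocab["<bos>"]] + ids + [vocab["<eos>"]]
```

Words that are missing from the vocabulary fall back to the `<unk>` id, so captions seen only at test time can still be encoded.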

Training

Edit the directory paths in training_rgb_flow.py to match your own setup.
To train the model, run this in a terminal:
```
python training_rgb_flow.py
```

Evaluation

Edit the directory paths in training_rgb_flow.py to match your own setup.
To test or evaluate the model, run this in a terminal:
```
python evaluation.py
```
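The coco_caption tools score a results file in the COCO caption format: a JSON list of objects that each hold an `image_id` and a `caption`. How evaluation.py builds that file is not shown here, so the following is only a sketch of the format, assuming predictions are held in a dictionary keyed by video id (the function names are hypothetical).

```python
import json


def to_coco_results(predictions):
    """Convert {video_id: caption} into the COCO caption results list."""
    return [
        {"image_id": video_id, "caption": caption}
        for video_id, caption in sorted(predictions.items())
    ]


def save_results(predictions, path):
    """Write the predictions to `path` as a COCO-style results JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(to_coco_results(predictions), f)
```

The `image_id` values have to match the ids used in the reference annotation file, otherwise the scorer cannot pair predictions with ground-truth captions.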

Results

| Scheme | METEOR | Bleu_4 | ROUGE_L | CIDEr |
| --- | --- | --- | --- | --- |
| Original Paper | 0.298 | N/A | N/A | N/A |
| Baseline | 0.2965893042 | 0.3152643061 | 0.6666146162 | 0.5998173136 |

References

Sequence to Sequence - Video to Text
S. Venugopalan, M. Rohrbach, J. Donahue, T. Darrell, R. Mooney, K. Saenko
The IEEE International Conference on Computer Vision (ICCV) 2015
