
sharpian/deepride


DeepRide: Dashcam Video Description Dataset for Autonomous Vehicle Location-Aware Trip Description

Graphical Abstract

Video description is one of the most challenging tasks in the combined domain of computer vision and natural language processing. Captions for various open- and constrained-domain videos have been generated in the recent past, but to the best of our knowledge, descriptions for driving dashcam videos have never been explored. To explore dashcam video description generation for autonomous driving, this study presents DeepRide: a large-scale dashcam driving video description dataset for location-aware dense video description generation. The human-described dataset comprises visual scenes and actions with diverse weather, people, objects, and geographical paradigms. It bridges the autonomous driving domain with video description by generating textual descriptions of the visual information seen by a dashcam. We describe 16,000 videos (40 seconds each) in English, employing 2,700 man-hours by two highly qualified teams with domain knowledge. Each description consists of eight to ten sentences, covering the dashcam video's global features and event features in 60 to 90 words. The dataset contains more than 130K sentences, totaling approximately one million words. We evaluate the dataset with a location-aware vision-language recurrent transformer framework to demonstrate the efficacy and significance of visio-linguistic research for autonomous vehicles. We provide baseline results by evaluating the dataset with three existing state-of-the-art recurrent models. The memory-augmented transformer performed best, owing to its highly summarized memory state for visual information and sentence history while generating the trip description. Our proposed dataset opens a new dimension of diverse and exciting applications, such as self-driving vehicle reporting, driver and vehicle safety, inter-vehicle road intelligence sharing, and travel occurrence reports.
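As a rough sanity check on the figures quoted above, the following Python sketch derives the bounds implied by the per-video numbers (16,000 videos, 40 seconds each, eight to ten sentences and 60 to 90 words per description, 2,700 man-hours). The function and variable names are illustrative only and are not part of the dataset or any accompanying code.

```python
# Figures taken from the abstract; all names here are hypothetical.
NUM_VIDEOS = 16_000
CLIP_SECONDS = 40
SENTENCES_PER_VIDEO = (8, 10)   # min, max sentences per description
WORDS_PER_VIDEO = (60, 90)      # min, max words per description
ANNOTATION_HOURS = 2_700


def dataset_bounds(num_videos, clip_seconds, sentence_range, word_range, annotation_hours):
    """Return totals and (min, max) bounds implied by the per-video figures."""
    return {
        # Total footage in hours.
        "video_hours": num_videos * clip_seconds / 3600,
        # Lower and upper bounds on the sentence and word counts.
        "sentences": tuple(num_videos * n for n in sentence_range),
        "words": tuple(num_videos * n for n in word_range),
        # Average annotation effort per video, in minutes.
        "minutes_per_video": annotation_hours * 60 / num_videos,
    }


bounds = dataset_bounds(
    NUM_VIDEOS, CLIP_SECONDS, SENTENCES_PER_VIDEO, WORDS_PER_VIDEO, ANNOTATION_HOURS
)
for key, value in bounds.items():
    print(key, value)
```

The reported totals (more than 130K sentences, approximately one million words) fall inside the implied ranges of 128,000 to 160,000 sentences and 960,000 to 1,440,000 words, both near the lower end.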


Citation:

G. Rafiq, M. Rafiq, B.-W. On, M. Sung and G. S. Choi, "DeepRide: Dashcam Video Description Dataset for Autonomous Vehicle Location-Aware Trip Description," in IEEE Access, vol. 10, pp. 107361-107375, 2022, doi: 10.1109/ACCESS.2022.3212745.

@ARTICLE{9913431,
  author={Rafiq, Ghazala and Rafiq, Muhammad and On, Byung-Won and Sung, Mankyu and Choi, Gyu Sang},
  journal={IEEE Access},
  title={DeepRide: Dashcam Video Description Dataset for Autonomous Vehicle Location-Aware Trip Description},
  year={2022},
  volume={10},
  number={},
  pages={107361-107375},
  doi={10.1109/ACCESS.2022.3212745}
}

Link to article

https://ieeexplore.ieee.org/document/9913431

Link to dataset
