DeepRide: Dashcam Video Description Dataset for Autonomous Vehicle Location-Aware Trip Description

Video description is one of the most challenging tasks in the combined domain of computer vision and natural language processing. Captions for various open and constrained domain videos have been generated in the recent past, but to the best of our knowledge, descriptions for driving dashcam videos have never been explored. With the aim of exploring dashcam video description generation for autonomous driving, this study presents DeepRide: a large-scale dashcam driving video description dataset for location-aware dense video description generation. The human-described dataset comprises visual scenes and actions with diverse weather, people, objects, and geographical paradigms. It bridges the autonomous driving domain with video description by generating textual descriptions of the visual information as seen by a dashcam. We describe 16,000 videos (40 seconds each) in English, employing 2,700 man-hours by two highly qualified teams with domain knowledge. The descriptions consist of eight to ten sentences covering each dashcam video’s global features and event features in 60 to 90 words. The dataset consists of more than 130K sentences, totaling approximately one million words. We evaluate the dataset by employing a location-aware vision-language recurrent transformer framework to elaborate on the efficacy and significance of visio-linguistic research for autonomous vehicles. We provide baseline results to evaluate the dataset by employing three existing state-of-the-art recurrent models. The memory-augmented transformer performed best owing to its highly summarized memory state for visual information and the sentence history while generating the trip description. Our proposed dataset opens a new dimension of diverse and exciting applications, such as self-driving vehicle reporting, driver and vehicle safety, inter-vehicle road intelligence sharing, and travel occurrence reports.
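As a rough sanity check on the figures quoted above, the following sketch derives the bounds they imply. It is not part of the dataset tooling: the constant and function names are hypothetical, and it assumes the per-video ranges (eight to ten sentences, 60 to 90 words) hold as hard bounds for every video.

```python
# Figures quoted in the abstract; all names here are illustrative only.
NUM_VIDEOS = 16_000
CLIP_SECONDS = 40
SENTENCES_PER_VIDEO = (8, 10)   # assumed to be hard per-video bounds
WORDS_PER_VIDEO = (60, 90)      # assumed to be hard per-video bounds
ANNOTATION_HOURS = 2_700


def dataset_bounds():
    """Derive aggregate bounds implied by the per-video figures."""
    return {
        # Total footage in hours: videos * seconds per clip / 3600.
        "video_hours": NUM_VIDEOS * CLIP_SECONDS / 3600,
        # (min, max) total sentences and words across all videos.
        "sentences": tuple(NUM_VIDEOS * n for n in SENTENCES_PER_VIDEO),
        "words": tuple(NUM_VIDEOS * n for n in WORDS_PER_VIDEO),
        # Average annotation effort per video, in minutes.
        "minutes_per_video": ANNOTATION_HOURS * 60 / NUM_VIDEOS,
    }


if __name__ == "__main__":
    for name, value in dataset_bounds().items():
        print(f"{name}: {value}")
```

Under these assumptions the implied ranges are 128,000 to 160,000 sentences and 960,000 to 1,440,000 words, which is consistent with the reported totals of more than 130K sentences and approximately one million words. The same figures work out to roughly 178 hours of footage and about 10 minutes of annotation effort per video.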

Citation:

G. Rafiq, M. Rafiq, B.-W. On, M. Sung and G. S. Choi, "DeepRide: Dashcam Video Description Dataset for Autonomous Vehicle Location-Aware Trip Description," in IEEE Access, vol. 10, pp. 107361-107375, 2022, doi: 10.1109/ACCESS.2022.3212745.

@ARTICLE{9913431, author={Rafiq, Ghazala and Rafiq, Muhammad and On, Byung-Won and Sung, Mankyu and Choi, Gyu Sang}, journal={IEEE Access}, title={DeepRide: Dashcam Video Description Dataset for Autonomous Vehicle Location-Aware Trip Description}, year={2022}, volume={10}, number={}, pages={107361-107375}, doi={10.1109/ACCESS.2022.3212745}}

Link to article

https://ieeexplore.ieee.org/document/9913431

Link to dataset
