PaddleVideo v2.0.0
Release Note
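PaddleVideo is built on the Paddle 2.0 dynamic graph and uses a modular design that decouples its functionality into separate components. Components can be easily combined, configured, and customized to implement video algorithm models quickly.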
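As an illustration of what a decoupled, configurable component design can look like, here is a minimal sketch in plain Python. It is not taken from the PaddleVideo codebase: `Registry`, `BACKBONES`, `HEADS`, `ToyBackbone`, `ToyHead`, `Recognizer`, and the config keys are all hypothetical names chosen for this example.

```python
# Hypothetical sketch of a registry-based modular design.
# None of these names come from the PaddleVideo codebase.


class Registry:
    """Maps a string name to a component class."""

    def __init__(self, name):
        self.name = name
        self._classes = {}

    def register(self, cls):
        # Used as a class decorator; the class name becomes the lookup key.
        self._classes[cls.__name__] = cls
        return cls

    def build(self, cfg):
        # cfg is a dict whose "name" key selects the class; the remaining
        # keys are passed to the constructor. Copy first so the caller's
        # config is not modified.
        kwargs = dict(cfg)
        cls = self._classes[kwargs.pop("name")]
        return cls(**kwargs)


BACKBONES = Registry("backbone")
HEADS = Registry("head")


@BACKBONES.register
class ToyBackbone:
    def __init__(self, depth):
        self.depth = depth


@HEADS.register
class ToyHead:
    def __init__(self, num_classes):
        self.num_classes = num_classes


class Recognizer:
    """Assembles independently configured components into one model."""

    def __init__(self, backbone, head):
        self.backbone = BACKBONES.build(backbone)
        self.head = HEADS.build(head)


# Swapping a component only requires changing the config, not the code.
config = {
    "backbone": {"name": "ToyBackbone", "depth": 50},
    "head": {"name": "ToyHead", "num_classes": 400},
}
model = Recognizer(**config)
```

In a design like this, adding a custom component means writing one class and registering it; the code that assembles the model stays unchanged.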
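Basic Capabilities
- Supports more datasets and model architectures, including the Kinetics-400, UCF-101, YouTube-8M, and ActivityNet datasets, among others.
- Releases multiple models for video classification and video action localization, including TSN, TSM, SlowFast, AttentionLSTM, and BMN.
- Completes the end-to-end deployment pipeline.
Highlights
- Releases PP-TSM, a 2D SOTA model: it reaches 73.5% Top-1 accuracy on Kinetics-400, 3.5% higher than the standard TSM, with the same number of parameters and faster training and inference.
- Releases several training acceleration schemes: SlowFast trains 100% faster than the original implementation, and TSN+DALI trains 3.6x faster than the original implementation.
Featured Applications
- Releases VideoTag, a large-scale video classification model: a pretrained video-tagging model trained on a ten-million-scale dataset, supporting 3,000 practical tags drawn from industry practice.
- Releases FootballAction, a football action detection algorithm: it efficiently locates the start and end times of football actions in a video and identifies the category of each action.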
Release Note
Supports the dynamic graph programming paradigm and is adapted to Paddle 2.0. This release includes:
- Various datasets. PaddleVideo supports various datasets, including Kinetics-400, UCF-101, and YouTube-8M.
- Various architectures. PaddleVideo supports more architectures, including video recognition models such as TSN, TSM, SlowFast, and AttentionLSTM, and action localization models such as BMN.
- Deployable. PaddleVideo is powered by Paddle Inference.
- Higher performance. PP-TSM, which is based on the standard TSM, achieves the best performance among 2D recognition networks: it has the same number of parameters as TSM but improves Top-1 accuracy to 73.5%.
- Faster training strategies. PaddleVideo supports faster training strategies: SlowFast training is 100% faster than the standard SlowFast version, and TSN+DALI speeds up training by 3.6x.
- VideoTag. A large-scale video classification model supporting 3k tags.
- FootballAction. A football action detection model.