UC3M-LP dataset is a comprehensive and open-source collection of annotated images for European (Spanish) license plate detection and recognition tasks. Researchers and developers are encouraged to use the UC3M-LP dataset to develop and evaluate algorithms for license plate detection, localization, character segmentation, and optical character recognition (OCR). The dataset also supports various research areas such as vehicle surveillance, automated toll systems, traffic analysis, and security applications. Check the open-access paper.
It is the largest open-source dataset for European license plate detection and recognition and the first one ever dedicated to Spanish license plates. It contains 1975 images from 2547 different vehicles with their corresponding license plate, comprising a total of 12757 plate characters.
The dataset is split into train, with 1580 images (80%), and test, with 395 images (20%). Each one of the 2547 license plates has been labeled with a two letter code. The first letter refers to the typology of Spanish license plates:
- Type A: 2498 samples of the most common long, one row with white background.
- Type B: 31 samples of motorcycle double row and white background.
- Type C: 1 sample of light motorcycle one row with yellow background.
- Type D: 11 samples of taxis and VTC (Spanish acronym for private hire vehicle) with blue background.
- Type E: 6 samples of trailer tows with black characters and red background.
and the second letter refers to the lighting conditions:
- Type D: 2185 samples at daytime.
- Type N: 362 samples at nighttime.
If you use this dataset in your research, please cite the following paper:
@article{RamajoBallester2024,
title = {Dual license plate recognition and visual features encoding for vehicle identification},
journal = {Robotics and Autonomous Systems},
volume = {172},
pages = {104608},
year = {2024},
issn = {0921-8890},
doi = {https://doi.org/10.1016/j.robot.2023.104608},
url = {https://www.sciencedirect.com/science/article/pii/S0921889023002476},
author = {Álvaro Ramajo-Ballester and José María {Armingol Moreno} and Arturo {de la Escalera Hueso}},
keywords = {Deep learning, Public dataset, ALPR, License plate recognition, Vehicle re-identification, Object detection},
}
The dataset is available for download.
The dataset is provided in a custom format. To transform it to YOLO format, please follow these instructions:
- Clone the repository:
git clone https://github.com/ramajoballester/UC3M-LP.git
cd UC3M-LP
- Install the required dependencies:
pip install -r requirements.txt
- Download and extract the dataset files and create this directory structure:
path/to/UC3M-LP/
└─── train
│ └─── 00001.jpg
│ └─── 00001.json
│ └─── 00002.jpg
│ └─── 00002.json
│ └─── ...
└─── test
│ └─── 00001.jpg
│ └─── 00001.json
│ └─── 00002.jpg
│ └─── 00002.json
│ └─── ...
└─── train.txt
└─── test.txt
- Run the script to transform the dataset to YOLO format. It will create 2 versions of the dataset, one for LP detection from the whole image and another one for LP recognition from the cropped LP region. The script will resize the images to the specified dimensions and save the resulting images and labels in in new directories. You can specify the desired dimensions for the images as arguments of the script.
python scripts/labels2yolo.py path/to/UC3M-LP 320 160
Run python scripts/labels2yolo.py -h
for more information.