This project uses PyTorch, an open source machine learning framework.
Download the project
git clone https://github.com/ishin-pie/e2e-scene-text-spotting.git
Install the dependencies from the requirements.txt file
cd e2e-scene-text-spotting
pip install -r requirements.txt
Note: we suggest installing inside a Python virtual environment.
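For example, you can create and activate one with Python's built-in venv module before running the pip install above (any other virtual environment tool works too):
python3 -m venv venv
source venv/bin/activate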
Learn more: Installing Deep Learning Frameworks on Ubuntu with CUDA support
Running the demo of our pre-trained model
python demo.py -m=model_best.pth.tar
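For reference, a .pth.tar file is an ordinary PyTorch checkpoint. Below is a minimal sketch of inspecting it; the key names (e.g. "state_dict") are assumptions and may differ in this repository:
import torch

# Load the checkpoint onto the CPU and list what it contains.
# The exact keys (e.g. "state_dict", "epoch") are assumptions.
checkpoint = torch.load("model_best.pth.tar", map_location="cpu")
print(checkpoint.keys())

# Typical pattern for restoring the weights into a model instance:
# model.load_state_dict(checkpoint["state_dict"])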
For training, we use the ICDAR 2015 Training Set and the ICDAR 2017 Training Set (Latin only). The ICDAR 2015 Test Set is used to validate the model.
The dataset should be structured as follows:
[dataset root directory]
├── train_images
│   ├── img_1.jpg
│   ├── img_2.jpg
│   └── ...
├── train_gts
│   ├── gt_img_1.txt
│   ├── gt_img_2.txt
│   └── ...
└── test_images
    ├── img_1.jpg
    ├── img_2.jpg
    └── ...
Note: the [dataset root directory] path should be set in the "config.json" file.
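The key names below are purely hypothetical; check the config.json shipped with the repository for the actual field that holds the dataset root:
{
  "data_loader": {
    "data_dir": "/path/to/dataset_root"
  }
}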
Sample of ground truth format:
x1,y1,x2,y2,x3,y3,x4,y4,script,transcription
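For example, a made-up annotation line describing one word region as a quadrilateral, with the four corner points listed clockwise from the top-left:
100,157,310,157,310,200,100,200,Latin,HELLO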
Training the model yourself
python train.py
Note: check the "config.json" file, which is used to adjust the training configuration.
Experiments on a GeForce RTX 2070