简体中文 | English

Open source OCR for the security of the digital world

Contents

Introduction
Recent News
Navigation
Overall Framework
Demo
Changelog(more)
TODO and Task Claim
Original initiator and start-up author
Acknowledgements
Sponsor
Authorization
Join us
Demo

Introduction

A completely open-source, free OCR toolkit that supports offline deployment across multiple platforms and programming languages.
For Chinese-speaking users: you are welcome to join our QQ group to download the models and test programs. QQ group number: 887298230
Motivation: PaddlePaddle's engineering support for deployment is limited. To make OCR inference easier on a wide range of devices, we convert the models to ONNX format and port the inference code to multiple platforms using Python/C++/Java/Swift/C#.
Origin of the name: light, fast, economical, and smart. The project builds on deep-learning OCR and leans on the strengths of AI and small models, treating speed as its mission and recognition quality as its priority.
Usage:
- If an existing model in this repo meets your requirements → deploy it with RapidOCR.
- If it does not → fine-tune on your own data with PaddleOCR → deploy with RapidOCR.

If this repo is helpful to you, please give it a star ⭐!
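As a rough illustration of the first path (using a model that already ships with the project), the sketch below calls the Python package. It assumes the `rapidocr_onnxruntime` package is installed, that its `RapidOCR` class is called with an image path, and that it returns `(result, elapse)` where each item of `result` is `[box, text, score]`. That return layout matches the 1.3.x series as far as we know, but check the linked documentation for your version. `extract_text` and `run_ocr` are hypothetical helper names introduced only for this example.

```python
from typing import List, Optional, Sequence


def extract_text(result: Optional[Sequence[Sequence]], min_score: float = 0.5) -> List[str]:
    """Return recognized strings whose confidence is at least min_score.

    Assumes each item is (box, text, score); the score is cast to float
    because some versions may return it as a string.
    """
    if not result:  # the engine may return None when no text is found
        return []
    return [text for _box, text, score in result if float(score) >= min_score]


def run_ocr(image_path: str) -> List[str]:
    """Run OCR on one image and return the recognized text lines."""
    # Imported lazily so extract_text stays usable without the package.
    from rapidocr_onnxruntime import RapidOCR  # pip install rapidocr_onnxruntime

    engine = RapidOCR()  # assumed to load the bundled det/cls/rec ONNX models
    result, _elapse = engine(image_path)
    return extract_text(result)
```

Calling `run_ocr("path/to/image.png")` should then return a list of strings, one per detected text line. The 0.5 threshold is an arbitrary default for the sketch, not a value recommended by the project.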

Recent News

2023-08-27:
- [Python] Integrated the PaddleOCR v4 models and evaluated them. The v4-based rapidocr package has been updated to v1.3.0; for documentation, see: link
- Sorted out the differences between versions in the rapidocr series and improved parts of the Python documentation.

Navigation

Wiki
Python demo
C++ demo (Windows/Linux/macOS)
- RapidOcrOnnx
- RapidOcrNcnn
JVM demo (Java/Kotlin)
- RapidOcrOnnxJvm
- RapidOcrNcnnJvm
.NET demo (C#)
Android demo
API
Web demo:
RapidStructure
FAQ
Derivatives
Related projects
- RapidOCRPDF: Extract content from PDFs.
- RapidVideOCR: Extract hard-coded subtitles from videos, based on RapidOCR.
- LGPMA_Infer: Table structure restoration | Blog post explaining the paper and source code
- Document Unwarping-PaperEdge | Demo
- Text Removal-CTRNet | Demo

Overall Framework

flowchart LR
    subgraph Step
    direction TB
    C(Text Det) --> D(Text Cls) --> E(Text Rec)
    end

    A[/OurSelf Dataset/] --> B(PaddleOCR) --Train--> Step --> F(PaddleOCRModelConverter)
    F --ONNX--> G{RapidOCR Deploy\n<b>Python/C++/Java/C#</b>}
    G --> H(Windows x86/x64) & I(Linux) & J(Android) & K(Web) & L(Raspberry Pi)

    click B "https://github.com/PaddlePaddle/PaddleOCR" _blank
    click F "https://github.com/RapidAI/PaddleOCRModelConverter" _blank
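The `Step` subgraph above chains three stages: text detection, text direction classification, and text recognition. The sketch below shows only that control flow, under stated assumptions: the stage functions are passed in as callables, boxes are simplified to `(x1, y1, x2, y2)` tuples, and the `fake_*` stages are hypothetical stand-ins rather than the project's actual models or API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2); a simplification for this sketch


@dataclass
class TextLine:
    box: Box
    angle: int = 0  # set by the classification stage (e.g. 0 or 180)
    text: str = ""  # set by the recognition stage


def run_pipeline(
    image,
    detect: Callable[[object], List[Box]],
    classify: Callable[[object, Box], int],
    recognize: Callable[[object, Box, int], str],
) -> List[TextLine]:
    """Chain the three stages: Text Det -> Text Cls -> Text Rec."""
    lines = []
    for box in detect(image):                # 1. find text regions
        angle = classify(image, box)         # 2. decide the region's orientation
        text = recognize(image, box, angle)  # 3. read the characters
        lines.append(TextLine(box=box, angle=angle, text=text))
    return lines


# Hypothetical stand-in stages: the "image" is just a dict mapping box -> text.
def fake_detect(image):
    return sorted(image)  # boxes ordered by their coordinates


def fake_classify(image, box):
    return 0  # pretend every region is upright


def fake_recognize(image, box, angle):
    return image[box]
```

Passing the stages as callables keeps the flow independent of any runtime, which mirrors how the same converted models are deployed from several languages in the diagram.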


Demo

Online demo
- For details, please refer to: ocrweb/README
- The model combination (optimal combination) used for the demo is:
```
ch_PP-OCRv3_det + ch_ppocr_mobile_v2.0_cls + ch_PP-OCRv3_rec
```
- Demo:
Hugging Face Demo
- The demo is built on Hugging Face's Spaces, generated by the Gradio library.
- Demo:

Changelog(more)

rapidocr
rapidocr_web
rapidocr_api

TODO and Task Claim

See here: link

Original initiator and start-up author

Acknowledgements

Many thanks to DeliciaLaniD for fixing the misplaced start position of the scan animation in ocrweb.
Many thanks to zhsunlight for suggesting a parameterized option for GPU inference and for careful, thoughtful testing.
Many thanks to lzh111222334 for fixing several bugs in the rec preprocessing of the Python version.
Many thanks to AutumnSun1996 for the suggestion in #42.
Many thanks to DeadWood8 for providing the documentation on packaging rapidocr_web as an exe with Nuitka.
Many thanks to Loovelj for fixing the text box sorting bug. For details, see issue 75.

Sponsor

| Sponsor | Applied Products |
| :---: | :---: |
|  | - |

If you would like to sponsor the project, click the Sponsor button at the top of this page. Please leave a note (e.g. your GitHub account name) so that we can add you to the sponsorship list above.

Authorization

The copyright of the OCR models belongs to Baidu; the copyright of the other engineering code belongs to the owner of this repository.
This software is licensed under Apache 2.0. You are welcome to contribute code, open issues, or submit PRs.

If you find this project useful in your research, please consider citing:

@misc{RapidOCR 2021,
    title={{Rapid OCR}: OCR Toolbox},
    author={MindSpore Team},
    howpublished = {\url{https://github.com/RapidAI/RapidOCR}},
    year={2021}
}

Join us

For international developers, we use RapidOCR Discussions as our community platform. All ideas and questions can be discussed there in English.
