
A cross platform OCR Library based on PaddleOCR & OnnxRuntime & OpenVINO.


MadMenHitBooker/RapidOCR


简体中文 | English

Open source OCR for the security of the digital world


Contents

Introduction

  • A completely open-source, free OCR toolkit that supports offline deployment across multiple platforms and programming languages.
  • For Chinese-speaking users: you are welcome to join our QQ group to download the models and test programs. QQ group number: 887298230
  • Motivation: the engineering support around Baidu's PaddlePaddle makes deployment inconvenient. To make OCR inference easier on a wide range of devices, we convert the models to ONNX format and port the inference code to various platforms using Python/C++/Java/Swift/C#.
  • Name origin: light, fast, economical and smart. RapidOCR is deep-learning-based OCR that focuses on the strengths of AI and on small models, with speed as the mission and recognition quality as the leading concern.
  • Usage:
    • If an existing model in the repo meets your requirements → deploy it with RapidOCR.
    • If it does not → fine-tune a model on your own data with PaddleOCR, then deploy it with RapidOCR.
  • If this repo is helpful to you, please give it a star ⭐!
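For the first case in the usage list above (an existing model already meets the requirements), a Python call might look roughly like the sketch below. It assumes the `rapidocr_onnxruntime` distribution is installed and that its `RapidOCR` class returns a `(result, elapse)` pair in which `result` is a list of `[box, text, score]` entries or `None`; check the package documentation for the version you install. The helper `keep_confident_text` and its `min_score` threshold are hypothetical names introduced only for this illustration.

```python
from typing import List, Optional, Sequence


def run_ocr(image_path: str) -> Optional[list]:
    """Run OCR on one image and return the raw result list (or None)."""
    # Assumption: `pip install rapidocr_onnxruntime` provides this class.
    # The import is local so the helper below works without the package.
    from rapidocr_onnxruntime import RapidOCR

    engine = RapidOCR()
    result, _elapse = engine(image_path)
    return result


def keep_confident_text(result: Optional[Sequence], min_score: float = 0.5) -> List[str]:
    """Return the recognized strings whose score is at least `min_score`.

    Each entry is assumed to be (box, text, score). The score is passed
    through float() because it may arrive as a number or as a string.
    """
    if not result:  # None or empty: nothing was recognized
        return []
    return [text for _box, text, score in result if float(score) >= min_score]
```

Calling `keep_confident_text(run_ocr("page.png"))` would then give a plain list of strings, with low-confidence lines dropped. The 0.5 cutoff is arbitrary and should be tuned per use case.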

Recent News

  • 2023-08-27:
    • [Python] Integrated the PaddleOCR v4 models and evaluated them. The v4-based rapidocr package has been updated to v1.3.0; for documentation, see: link
    • Sorted out the differences between the rapidocr version series and improved parts of the Python documentation.

Navigation

Overall Framework

flowchart LR
    subgraph Step
    direction TB
    C(Text Det) --> D(Text Cls) --> E(Text Rec)
    end

    A[/OurSelf Dataset/] --> B(PaddleOCR) --Train--> Step --> F(PaddleOCRModelConverter)
    F --ONNX--> G{RapidOCR Deploy\n<b>Python/C++/Java/C#</b>}
    G --> H(Windows x86/x64) & I(Linux) & J(Android) & K(Web) & L(Raspberry Pi)

    click B "https://github.com/PaddlePaddle/PaddleOCR" _blank
    click F "https://github.com/RapidAI/PaddleOCRModelConverter" _blank
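The `Step` subgraph in the flowchart (Text Det → Text Cls → Text Rec) can be read as three stages chained per image. The following is a minimal sketch of that chaining only, not the project's actual implementation: the `OcrPipeline` class, the `Box` type and the three stand-in stage functions are hypothetical, and a real pipeline would crop each detected region and run ONNX models at every stage.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Simplified axis-aligned box: (left, top, right, bottom).
Box = Tuple[int, int, int, int]


@dataclass
class OcrPipeline:
    detect: Callable[[object], Sequence[Box]]     # Text Det: image -> text boxes
    classify: Callable[[object, Box], int]        # Text Cls: box -> rotation in degrees
    recognize: Callable[[object, Box, int], str]  # Text Rec: box + rotation -> text

    def __call__(self, image: object) -> List[Tuple[Box, str]]:
        results = []
        for box in self.detect(image):
            angle = self.classify(image, box)         # e.g. 0 or 180
            text = self.recognize(image, box, angle)  # read the (re-oriented) region
            results.append((box, text))
        return results


# Stand-in stages so the sketch runs without any model files.
def fake_detect(image: object) -> Sequence[Box]:
    return [(0, 0, 50, 10), (0, 20, 50, 30)]


def fake_classify(image: object, box: Box) -> int:
    return 0


def fake_recognize(image: object, box: Box, angle: int) -> str:
    return f"text@{box[1]}"


pipeline = OcrPipeline(fake_detect, fake_classify, fake_recognize)
print(pipeline("image.png"))
```

Keeping each stage as a plain callable mirrors the diagram: a stage can be swapped (for example, a fine-tuned recognition model) without touching the other two.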

Demo

  • Online demo
    • For details, please refer to: ocrweb/README
    • The model combination (optimal combination) used for the demo is:
      ch_PP-OCRv3_det + ch_ppocr_mobile_v2.0_cls + ch_PP-OCRv3_rec
      
    • Demo:
  • Hugging Face Demo
    • The demo is built on Hugging Face's Spaces, generated by the Gradio library.
    • Demo:

Changelog(more)

TODO and Task Claim

Original initiator and start-up author

Acknowledgements

  • Many thanks to DeliciaLaniD for fixing the misplaced start position of the scan animation in ocrweb.
  • Many thanks to zhsunlight for suggesting parameterized GPU inference calls and for the careful, thorough testing.
  • Many thanks to lzh111222334 for fixing several bugs in the rec preprocessing of the Python version.
  • Many thanks to AutumnSun1996 for the suggestion in #42.
  • Many thanks to DeadWood8 for providing the documentation on packaging rapidocr_web into an exe with Nuitka.
  • Many thanks to Loovelj for fixing the text box sorting bug. For details, see issue 75.

Sponsor

Sponsor Applied Products
  • If you want to sponsor the project, click the Sponsor button at the top of this page. Please include a note (e.g. your GitHub account name) so that you can be added to the sponsorship list above.

Authorization

  • The copyright of the OCR models belongs to Baidu; the copyright of the rest of the engineering code belongs to the owner of this repository.
  • This software is licensed under Apache 2.0. You are welcome to contribute code, open issues, or submit pull requests.
  • If you find this project useful in your research, please consider citing:
    @misc{RapidOCR2021,
        title={{Rapid OCR}: OCR Toolbox},
        author={RapidAI Team},
        howpublished = {\url{https://github.com/RapidAI/RapidOCR}},
        year={2021}
    }

Join us

  • For international developers, RapidOCR Discussions serves as our international community platform. All ideas and questions can be discussed there in English.

Demo

Demonstration with C++/JVM

Demonstration with .Net

Demonstration with multi_language
