After rotation, text is not extracted #1

Open
eikek opened this issue Aug 16, 2022 · 0 comments

eikek commented Aug 16, 2022

Original: eikek/docspell#554 (comment)

I ran a few more tests; here are the results:

  1. First, I uploaded the original document (a jpg file with incorrect rotation). OCR couldn't recognize the text properly.
  2. I used the rotate addon. The converted PDF got rotated, but the extracted text didn't change.
  3. I also made a copy of the jpg file, rotated it with the Windows Photos app (I then checked in Paint to confirm it was rotated properly), and uploaded it. The "original" jpg file was shown rotated properly, but the converted PDF was rotated incorrectly (as it was originally in point 1).
  4. However, when I took the properly rotated jpg file from point 3, opened it in Paint, added a single dot anywhere, and uploaded it, both the original file and the converted PDF were rotated properly (the rotation was not reverted as it was in point 3), and OCR recognized the text properly.