Use page.get_text("json", flags=2) to extract the text, and the bbox in the extraction result has a negative number #1104
Replies: 2 comments
-
I want to get the correct height and width of the PDF page (including the page margins, like the page size from conver2pdf), and get positive bbox values.
-
Negative coordinates are not necessarily a bug. They can occur, and when they do, the PDF creator is responsible.
By the way: your width and height are both positive, so that is not the problem.
If you only want to see text with positive coordinates, specify a clip rectangle when extracting text:
page.get_text(..., clip=rect)
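With PyMuPDF itself, passing the page rectangle as the clip, e.g. `page.get_text("json", flags=2, clip=page.rect)`, should be the most direct way to do this. If the JSON has already been extracted, a rough alternative is to filter it afterwards. The sketch below assumes the usual layout of the "json" output (top-level `width`, `height`, and a `blocks` list whose items carry a `bbox`); the helper name `blocks_inside` and the sample data are made up for illustration and are not part of the library.

```python
import json


def blocks_inside(page_json, rect=None):
    """Keep only blocks whose bbox lies fully inside rect.

    page_json: a string shaped like the output of page.get_text("json").
    rect: (x0, y0, x1, y1); defaults to (0, 0, width, height) of the page.
    """
    data = json.loads(page_json)
    if rect is None:
        rect = (0, 0, data["width"], data["height"])
    rx0, ry0, rx1, ry1 = rect

    kept = []
    for block in data["blocks"]:
        x0, y0, x1, y1 = block["bbox"]
        # A block with any negative (or otherwise out-of-page) edge is dropped.
        if x0 >= rx0 and y0 >= ry0 and x1 <= rx1 and y1 <= ry1:
            kept.append(block)
    return kept


# Hand-made sample standing in for real extraction output (hypothetical values).
sample = json.dumps({
    "width": 595,
    "height": 842,
    "blocks": [
        {"type": 0, "bbox": [72, 72, 300, 90], "lines": []},
        {"type": 0, "bbox": [-15.5, 100, 120, 118], "lines": []},
    ],
})

inside = blocks_inside(sample)
```

Note that this post-filter works per block, while `clip` is applied by the library during extraction, so the two approaches may not return identical results for text that straddles the page edge.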