Exact match using PyMuPDF #1277
Replies: 2 comments 13 replies
-
Within a text block, you could check whether the exact sequence of words in your search string equals a sub-sequence of the block text split into words ... Just an idea. Or extract the text as "words" and sort them by block number, line number, and vertical and horizontal coordinates. Then look whether your ...
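To make the word-sequence idea concrete, here is a minimal sketch in plain Python. It assumes the words come in the 8-item tuple layout `(x0, y0, x1, y1, text, block_no, line_no, word_no)` that `page.get_text("words")` returns. The helper names `normalize` and `find_exact_phrase` are made up for illustration, and stripping punctuation and ignoring case are my own assumptions about what "exact" should mean. For simplicity it sorts by block, line and word number rather than by coordinates, and it does not stop a match from running across a block boundary.

```python
import string
from operator import itemgetter


def normalize(token):
    """Lower-case a token and strip leading/trailing punctuation (an assumption)."""
    return token.strip(string.punctuation).lower()


def find_exact_phrase(words, phrase):
    """Return every run of consecutive word tuples whose texts equal the phrase.

    `words` is a list of (x0, y0, x1, y1, text, block_no, line_no, word_no)
    tuples. Whole words are compared, so "city" never matches "electricity".
    """
    needle = [t for t in (normalize(p) for p in phrase.split()) if t]
    if not needle:
        return []
    # Put the words into reading order: block, then line, then word number.
    ordered = sorted(words, key=itemgetter(5, 6, 7))
    tokens = [normalize(w[4]) for w in ordered]
    n = len(needle)
    # Slide a window of n words over the page and compare word by word.
    return [
        ordered[i:i + n]
        for i in range(len(tokens) - n + 1)
        if tokens[i:i + n] == needle
    ]
```

Because the comparison runs over the ordered word list and not over a clipped rectangle, a phrase that starts at the end of one line and continues on the next is found like any other. Each match keeps the word rectangles, so they can be used for highlighting afterwards.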
-
Combining arbitrary quads may not make any sense, because
In both above cases it would be hard to even uniquely define what the joining quad should be. You can, however, easily build the enveloping rectangle of a quad. Also, please read the documentation on how to highlight text that may be spread across multiple lines when you know the top-left of the beginning and the bottom-right of the end.
Where is the problem? If one of the rectangles ...
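As a rough illustration of the enveloping-rectangle idea, here is a small sketch in plain Python. It treats a quad as four `(x, y)` corner points and a rectangle as an `(x0, y0, x1, y1)` tuple. The helper names `quad_envelope`, `union_rect` and `span_points` are hypothetical, not PyMuPDF functions. As far as I know, PyMuPDF already exposes the envelope as the `rect` property of a `Quad`, and `Page.add_highlight_annot` accepts `start` and `stop` points, but please check the documentation for the exact behaviour.

```python
def quad_envelope(quad):
    """Smallest axis-parallel rectangle (x0, y0, x1, y1) containing all four corners."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return (min(xs), min(ys), max(xs), max(ys))


def union_rect(a, b):
    """Smallest rectangle containing both rectangles a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def span_points(rects):
    """Top-left of the first and bottom-right of the last rectangle.

    `rects` must already be in reading order. These two points are the
    kind of begin/end pair a multi-line highlight needs.
    """
    first, last = rects[0], rects[-1]
    return (first[0], first[1]), (last[2], last[3])
```

Note that joining rectangles from different lines with `union_rect` gives one big box that also covers unrelated text on those lines, which is why the begin/end points are usually the better choice for multi-line matches.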
-
I'm working on an app in which we need to verify whether a text is an exact match (e.g. "city" doesn't match "electricity").
I've looked at a few existing questions (#678 and Stack Overflow) but couldn't find an accurate solution.
I found a way that is the most accurate of all the solutions available online (below).
But I'm facing an issue when trying to verify text that spans multiple lines.
Here is my code:
This works perfectly fine, except in multi-line searches. I think it's because the `clip` in `page.get_text` isn't spanning multiple lines. Is there any way to make it more accurate?