Hi,
As an improvement, I would like to suggest exposing PDF objects as Python objects, with one class per object type and indexing on non-primitives to access nested values.
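To make the suggestion concrete, here is a rough sketch of what such access might look like. The classes `PdfName`, `PdfArray` and `PdfDict` are hypothetical names invented for this illustration; they are not part of PyMuPDF or MuPDF, and the sample page dictionary is made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PdfName:
    """Hypothetical wrapper for a PDF name such as /Page."""
    value: str


@dataclass
class PdfArray:
    """Hypothetical wrapper for a PDF array; supports integer indexing."""
    items: List[Any] = field(default_factory=list)

    def __getitem__(self, index: int) -> Any:
        return self.items[index]

    def __len__(self) -> int:
        return len(self.items)


@dataclass
class PdfDict:
    """Hypothetical wrapper for a PDF dictionary; supports key indexing."""
    entries: Dict[str, Any] = field(default_factory=dict)

    def __getitem__(self, key: str) -> Any:
        return self.entries[key]


# A made-up page object, built by hand purely for illustration.
page = PdfDict({
    "Type": PdfName("Page"),
    "MediaBox": PdfArray([0, 0, 595, 842]),
    "Resources": PdfDict({
        "Font": PdfDict({"F1": PdfDict({"BaseFont": PdfName("Helvetica")})}),
    }),
})

# Nested values are reached by chained indexing on non-primitives.
width = page["MediaBox"][2]
base_font = page["Resources"]["Font"]["F1"]["BaseFont"].value
```

Primitives (numbers, strings) would stay plain Python values, while arrays and dictionaries would be the only types that support indexing.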
Thanks for submitting this!
There are a lot of analogies between Python's object model and what is called an "object" in PDF. So your idea does have tempting aspects.
There are, however, good reasons why we don't want to do this:
MuPDF supports a range of different document types. If you count them and include images, you will end up with more than a dozen. This set of types is not set in stone: support for the MOBI e-book format, for example, was added in version 1.21.0.
PDF is just one type among many others. The overall strategy is to abstract from the differences between these document types and to keep a large set of common, universally applicable code. Text extraction a…