Expose PDF objects as Python objects? #2022

Answered by JorjMcKie
yossizahn asked this question in Q&A

Thanks for submitting this!

There are a lot of analogies between Python's object model and what is called an "object" in PDF. So your idea does have tempting aspects.

There are, however, good reasons why we don't want to do this:

  1. MuPDF supports a range of different document types. If you count them and include images, you will end up with more than a dozen. This plethora of types is not set in stone: support for the MOBI e-book format, for example, was added in version 1.21.0.
    PDF is just one type among many. The overall strategy is to abstract from the differences between these document types and to keep a large set of common, universally applicable code. Text extraction a…
