Home
Welcome to the PyMuPDF wiki!
PyMuPDF (formerly known as python-fitz) is a Python binding for MuPDF - "a lightweight PDF and XPS viewer".
MuPDF can access files in PDF, XPS, OpenXPS, CBZ (comic book) and EPUB (e-book) formats.
These are files with extensions *.pdf, *.xps, *.oxps, *.cbz or *.epub (so in essence, with this binding you can develop e-book viewers in Python ...)
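As a minimal sketch of what this looks like in practice, the snippet below checks a file name against the extensions listed above and, if it matches, opens the file with PyMuPDF to read its page count and metadata. The helper names `SUPPORTED_EXTENSIONS`, `is_supported` and `describe` are made up for this illustration and are not part of PyMuPDF; only `fitz.open`, `len(doc)`, `doc.metadata` and `doc.close` are library calls.

```python
import os

# Extensions named on this page. This set is an illustrative helper,
# not something PyMuPDF itself exposes.
SUPPORTED_EXTENSIONS = {".pdf", ".xps", ".oxps", ".cbz", ".epub"}


def is_supported(path):
    """Return True if the file extension is one of those listed above."""
    return os.path.splitext(path)[1].lower() in SUPPORTED_EXTENSIONS


def describe(path):
    """Open a document with PyMuPDF and return (page count, metadata dict)."""
    if not is_supported(path):
        raise ValueError("unsupported file type: " + path)

    # Imported here so the extension check also works where PyMuPDF
    # is not installed.
    import fitz

    doc = fitz.open(path)
    try:
        return len(doc), doc.metadata
    finally:
        doc.close()
```

Assuming a file called `example.pdf` exists in the current directory (a hypothetical name), `pages, meta = describe("example.pdf")` would return the number of pages and the metadata dictionary of that document.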
PyMuPDF provides access to all important functions of MuPDF from within a Python environment. Nevertheless, we are continuously expanding this function set.
MuPDF stands out among all similar products for its top rendering capability and unsurpassed processing speed.
Check this out yourself:
Compare the various free PDF viewers. In terms of speed and rendering quality, SumatraPDF ranks at the top (apart from MuPDF's own standalone viewer), and has done so ever since it was changed to use the MuPDF library!
Also have a look at the performance chapter of the documentation. There you will find several test results that underpin these statements.
While PyMuPDF had been available for several years for an earlier version of MuPDF (1.2), it was not until mid-May 2015 that its creator and a few co-workers decided to upgrade it to support the then-current release of MuPDF, which was 1.7.
Since then we have continuously upgraded PyMuPDF to keep it in sync with MuPDF's ongoing development. MuPDF seems to roughly follow a semi-annual schedule with new releases appearing in April and late November each year.
So far, we have been able to follow up quickly with a corresponding new PyMuPDF version (and, where needed, subsequent sub-versions) each time: version 1.8.0 was published in November 2015, version 1.9.1 followed in April 2016, etc.
PyMuPDF has been tested and runs on Mac, Linux and Windows (XP through 10), with Python 2.7 through Python 3.7 (x86 and x64 versions). Other Python platforms should work too, as long as MuPDF supports them.
We invite you to join our efforts by contributing to the wiki pages, by using what is there and, of course, by submitting issues and bugs to the site!
The Python import statement for this library is import fitz. Here is the reason why:
The original rendering library for MuPDF was called Libart. "After Artifex Software acquired the MuPDF project, the development focus shifted on writing a new modern graphics library called Fitz. Fitz was originally intended as an R&D project to replace the aging Ghostscript graphics library, but has instead become the rendering engine powering MuPDF." (Quoted from Wikipedia).
HOWTO Button annots with JavaScript
HOWTO work with PDF embedded files
HOWTO extract text from inside rectangles
HOWTO extract text in natural reading order
HOWTO create or extract graphics
HOWTO create your own PDF Drawing
Rectangle inclusion & intersection
Metadata & bookmark maintenance