v16.2.0 release notes
jbarlow83 committed Apr 16, 2024
1 parent 9ba4e3a commit 0e013df
Showing 2 changed files with 26 additions and 6 deletions.
13 changes: 7 additions & 6 deletions docs/advanced.rst
Expand Up @@ -118,15 +118,16 @@ OCR for huge images
-------------------

Tesseract has internal limits on the size
of images it will process. If you issue
``--tesseract-downsample-large-images``, OCRmyPDF will downsample images
to fit Tesseract limits. (The limits are usually entered only for scanned
images of oversized media, such as large maps or blueprints exceeding
110 cm or 43 inches in either dimension, and at high DPI.)
of images it will process. By default,
``--tesseract-downsample-large-images`` is enabled, and OCRmyPDF will
downsample images to fit Tesseract limits. (The limits are usually encountered
only for scanned images of oversized media, such as large maps or blueprints exceeding
110 cm or 43 inches in either dimension, and at high DPI.) This feature can be disabled
using ``--no-tesseract-downsample-large-images``.

``--tesseract-downsample-above Npixels`` adjusts the threshold at which images
will be downsampled. By default, only images that exceed any of Tesseract's
internal limits are downsampled.
internal limits are downsampled (32767 pixels on either dimension).

You will also need to set ``--tesseract-timeout`` high enough to allow
for processing.
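
As a rough illustration of the arithmetic behind the threshold, the sketch below checks whether a scan would exceed the 32767-pixel limit quoted above and computes a scale factor that would bring it back within bounds. The function names and the single-factor scaling strategy are assumptions made for illustration; this is not OCRmyPDF's actual downsampling code.

```python
import math

# Assumption: 32767 px per dimension, the limit quoted in the text above.
TESSERACT_MAX_DIMENSION = 32767


def pixels(length_inches: float, dpi: float) -> int:
    """Pixel extent of a scan of the given physical length at the given DPI."""
    return math.ceil(length_inches * dpi)


def downsample_scale(
    width_px: int, height_px: int, limit: int = TESSERACT_MAX_DIMENSION
) -> float:
    """Return a scale factor (<= 1.0) that fits both dimensions within limit.

    Hypothetical helper: one uniform factor is applied to both dimensions
    so that the aspect ratio is preserved.
    """
    largest = max(width_px, height_px)
    if largest <= limit:
        return 1.0  # already within the limit, no downsampling needed
    return limit / largest
```

Under these assumptions, a 43-inch edge scanned at 600 DPI is 25800 pixels and needs no downsampling, while the same edge at 1200 DPI is 51600 pixels and would have to be scaled down.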
19 changes: 19 additions & 0 deletions docs/release_notes.rst
Expand Up @@ -30,6 +30,25 @@ OCRmyPDF typically supports the three most recent Python versions.

.. |OCRmyPDF PyPI| image:: https://img.shields.io/pypi/v/ocrmypdf.svg


v16.2.0
=======

- Fixed issue 'NoneType' object has no attribute 'get' when optimizing certain PDFs.
:issue:`1293,1271`
- Switched formatting from black to ruff.
- Added support for sending sidecar output to io.BytesIO.
- Added support for converting HEIF/HEIC images (the native image format of iPhones and
some other devices) to PDFs, when the appropriate pi-heif library is installed.
This library is marked as a dependency, but maintainers may opt out if needed.
- We now default to downsampling large images that would exceed Tesseract's internal
limits, but only when they would otherwise cause processing to fail. Previously, this behavior only
occurred if specifically requested on command line. It can still be configured
and disabled. See the --tesseract command line options.
- Added MacPorts install instructions. Thanks @akierig.
- Improved logging output when an unexpected error occurs while trying to obtain
the version of a third party program.

v16.1.2
=======
