diff --git a/docs/advanced.rst b/docs/advanced.rst
index 031be6b1a..7a37ad030 100644
--- a/docs/advanced.rst
+++ b/docs/advanced.rst
@@ -118,15 +118,16 @@ OCR for huge images
 -------------------
 
 Tesseract has internal limits on the size
-of images it will process. If you issue
-``--tesseract-downsample-large-images``, OCRmyPDF will downsample images
-to fit Tesseract limits. (The limits are usually entered only for scanned
-images of oversized media, such as large maps or blueprints exceeding
-110 cm or 43 inches in either dimension, and at high DPI.)
+of images it will process. By default,
+``--tesseract-downsample-large-images`` is enabled, and OCRmyPDF will
+downsample images to fit Tesseract limits. (The limits are usually encountered
+only for scanned images of oversized media, such as large maps or blueprints exceeding
+110 cm or 43 inches in either dimension, and at high DPI.) This feature can be disabled
+using ``--no-tesseract-downsample-large-images``.
 
 ``--tesseract-downsample-above Npixels`` adjusts the threshold at which images
 will be downsampled. By default, only images that exceed any of Tesseract's
-internal limits are downsampled.
+internal limits are downsampled (32767 pixels on either dimension).
 
 You will also need to set ``--tesseract-timeout`` high enough to allow
 for processing.
diff --git a/docs/release_notes.rst b/docs/release_notes.rst
index 1c471b211..29c7a6169 100644
--- a/docs/release_notes.rst
+++ b/docs/release_notes.rst
@@ -30,6 +30,25 @@ OCRmyPDF typically supports the three most recent Python versions.
 
 .. |OCRmyPDF PyPI| image:: https://img.shields.io/pypi/v/ocrmypdf.svg
 
+
+v16.2.0
+=======
+
+- Fixed issue 'NoneType' object has no attribute 'get' when optimizing certain PDFs.
+  :issue:`1293,1271`
+- Switched formatting from black to ruff.
+- Added support for sending sidecar output to io.BytesIO.
+- Added support for converting HEIF/HEIC images (the native image format of iPhones and
+  some other devices) to PDFs, when the appropriate pi-heif library is installed.
+  This library is marked as a dependency, but maintainers may opt out if needed.
+- We now default to downsampling large images that would exceed Tesseract's internal
+  limits, but only if they would cause processing to fail. Previously, this behavior only
+  occurred if specifically requested on the command line. It can still be configured
+  and disabled. See the --tesseract command line options.
+- Added MacPorts install instructions. Thanks @akierig.
+- Improved logging output when an unexpected error occurs while trying to obtain
+  the version of a third-party program.
+
 v16.1.2
 =======
 