Describe the bug (mandatory)
When I extract the text of this test PDF, some characters show up as �. How can I resolve this? Thanks in advance.
This is a typical "Discussions" post. Let me convert it first.
As background reading, I recommend this article on the Artifex blog.
A PDF creator may choose fonts that contain no information on how to translate the visual appearance of characters (their glyphs) back to the original Unicode values.
This mapping, the so-called CMAP (character map), may be missing, either by error or on purpose. Text extraction then has nothing to look up and returns the replacement character � (U+FFFD) instead.
The only way out (see the article) is to OCR the page, or parts of it, which seems advisable in your case. Take a look at this demo script.
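
To illustrate the idea of OCRing only parts of a page, here is a minimal sketch. It is not the demo script itself. The function names `spans_needing_ocr` and `ocr_broken_spans` are made up for this example, and the OCR step assumes that Tesseract and its language data are installed where PyMuPDF can find them.

```python
REPLACEMENT = "\ufffd"  # the "�" character that marks an unresolvable glyph


def spans_needing_ocr(page_dict):
    """Return (bbox, text) for every text span that contains U+FFFD.

    `page_dict` is expected to have the layout that PyMuPDF's
    page.get_text("dict") returns: blocks -> lines -> spans.
    """
    hits = []
    for block in page_dict.get("blocks", []):
        # Image blocks carry no "lines" key, so they are skipped here.
        for line in block.get("lines", []):
            for span in line.get("spans", []):
                if REPLACEMENT in span["text"]:
                    hits.append((tuple(span["bbox"]), span["text"]))
    return hits


def ocr_broken_spans(pdf_path, page_number=0, dpi=300):
    """OCR only the spans of one page whose text contains U+FFFD.

    Hypothetical helper: returns a list of (bbox, extracted_text, ocr_text).
    """
    import fitz  # PyMuPDF, imported lazily so the helper above works without it

    doc = fitz.open(pdf_path)
    page = doc[page_number]
    results = []
    for bbox, text in spans_needing_ocr(page.get_text("dict")):
        # Render just the area of the broken span at a resolution suited to OCR.
        pix = page.get_pixmap(dpi=dpi, clip=fitz.Rect(bbox))
        # pdfocr_tobytes() yields a one-page PDF with an OCR text layer.
        ocr_doc = fitz.open(stream=pix.pdfocr_tobytes(), filetype="pdf")
        results.append((bbox, text, ocr_doc[0].get_text().strip()))
    return results
```

Calling `ocr_broken_spans("test.pdf")` (the file name is a placeholder) would return the rectangle, the broken extracted text, and the OCR result for each affected span. Restricting OCR to these spans keeps the correctly extractable text untouched and is usually much faster than OCRing the full page.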