Adds invisible text layers to PDFs for Overview

Methodology

This program always outputs 0.json and 0.blob.

The output 0.json has wantOcr:false.
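As an illustration only, the wantOcr guarantee could be spot-checked from a shell with something like the sketch below. The function name has_want_ocr_false is hypothetical and not part of this repository, and matching with grep is a simplification: a stricter check would parse the JSON.

```shell
# Hypothetical helper (not part of this repository): succeed only if the
# given JSON file contains a wantOcr key whose value is the literal false.
# This is a pattern match, not a real JSON parse.
has_want_ocr_false() {
  grep -Eq '"wantOcr"[[:space:]]*:[[:space:]]*false' "$1"
}
```

For example, has_want_ocr_false 0.json returns exit status 0 when the file contains "wantOcr": false and a non-zero status otherwise.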

We custom-built PdfOcr, so we fix any errors in it ourselves. The one error we can't handle is OutOfMemory: it makes the program exit with a non-zero exit code, and we count on the framework to bail us out.

Testing

Write each test as a directory named test/test-*. Running docker build . runs the tests.

Each test has input.blob (which means the same as in production) and input.json (whose contents are $1 in do-convert-single-file). The files stdout, 0.json and 0.blob in the test directory are expected values. If actual values differ from expected values, the test fails.
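The comparison rule described above can be sketched roughly as follows. This is a minimal sketch, not the repository's actual test runner: the function name check_test_output and its two directory arguments are assumptions made for illustration, and the messages only imitate the style of the failure output quoted in this README. It assumes a file that is absent from the test directory is expected not to be written.

```shell
# Hypothetical helper (not the repository's real test runner).
# $1 = test directory holding the expected files (e.g. test/test-embedded-png)
# $2 = directory holding the files the converter actually produced
check_test_output() {
  expected_dir=$1
  actual_dir=$2
  failed=0
  for name in stdout 0.json 0.blob; do
    if [ -e "$expected_dir/$name" ]; then
      if [ ! -e "$actual_dir/$name" ]; then
        # Expected file exists, but the converter did not produce it
        echo "did not write $name, but we expected it to exist"
        failed=1
      elif ! cmp -s "$expected_dir/$name" "$actual_dir/$name"; then
        # Both exist; cmp -s compares them byte for byte without output
        echo "$name differs from the expected value"
        failed=1
      fi
    elif [ -e "$actual_dir/$name" ]; then
      # No expected file, so the converter should not have written one
      echo "wrote $name, but we expected it not to exist"
      failed=1
    fi
  done
  return "$failed"
}
```

Calling check_test_output test/test-embedded-png /tmp/actual-output (the second path is a placeholder) returns 0 when every file matches and 1 otherwise, printing one line per mismatch.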

PDF is a tricky format to get exactly right. You may need to use the Docker image itself to generate expected output files. For instance, here is how we build test-embedded-png/0.blob:

1. Wrote test/test-embedded-png/{input.json,input.blob,0.json,stdout}
2. Ran docker build . The end of the output looked like this:

       Step 12/13 : RUN [ "/app/test-convert-single-file" ]
        ---> Running in 202f38be95c9
       1..1
       not ok 1 - test-embedded-png
           do-convert-single-file wrote /tmp/test-do-convert-single-file887786150/0.blob, but we expected it not to exist
       ...

3. docker cp 202f38be95c9:/tmp/test-do-convert-single-file887786150/0.blob test/test-embedded-png/
4. docker rm -f 202f38be95c9
5. Inspect the file to make sure it behaves as expected
6. docker build . again -- success!
