UCD Library Media Utils

This is a collection of scripts and utilities for working with media files in the UCD Library. Currently the following applications are bundled:

tesseract: OCR (Optical Character Recognition) for PDFs/Images
- CLI docs: https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage.html
ffmpeg: Video/Audio transcoding
- CLI docs: https://ffmpeg.org/ffmpeg.html
imagemagick: Image manipulation
- Convert CLI docs: https://imagemagick.org/script/convert.php

Local Usage

Assuming you have a local PDF file named test.pdf in your current directory, you can run the following command to OCR the file. The hOCR output is written to test.hocr in the same directory:

docker run \
  -w /data \
  --rm \
  -v "$(pwd):/data" \
  gcr.io/ucdlib-pubreg/media-utils:latest \
  tesseract test.pdf test --dpi 300 -l eng --psm 1 --oem 3 hocr
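
If you run several of the bundled tools, the repeated docker flags can be folded into a small shell function. The following is only a sketch, not part of this repository: the function name media_utils and the DOCKER override variable are hypothetical, introduced here for illustration. It mounts the current directory at /data exactly as the command above does.

```shell
# Hypothetical helper (not shipped with this repo): wraps the docker
# invocation shown above so only the tool and its arguments need typing.
# Setting DOCKER=echo prints the docker arguments instead of running them.
media_utils() {
  "${DOCKER:-docker}" run \
    -w /data \
    --rm \
    -v "$(pwd):/data" \
    gcr.io/ucdlib-pubreg/media-utils:latest \
    "$@"
}
```

With the function defined, the OCR command above becomes: media_utils tesseract test.pdf test --dpi 300 -l eng --psm 1 --oem 3 hocr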

Here is another example, which resizes an image to 50% of its original size:

docker run \
  -w /data \
  --rm \
  -v "$(pwd):/data" \
  gcr.io/ucdlib-pubreg/media-utils:latest \
  convert test.png -resize 50% test_50.png
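
ffmpeg is bundled as well, so a transcode can presumably follow the same pattern. The sketch below is an assumption rather than a documented workflow: the function name transcode, the DOCKER override variable, and the file names are hypothetical, and ffmpeg is left to pick default codecs from the output extension, which depends on how the bundled build was compiled.

```shell
# Hypothetical sketch: transcode a file in the current directory with the
# bundled ffmpeg. Input and output names are relative to the current directory.
# Setting DOCKER=echo prints the docker arguments instead of running them.
transcode() {
  input="$1"
  output="$2"
  "${DOCKER:-docker}" run \
    -w /data \
    --rm \
    -v "$(pwd):/data" \
    gcr.io/ucdlib-pubreg/media-utils:latest \
    ffmpeg -i "$input" "$output"
}
```

For example, transcode test.mov test.mp4 would write test.mp4 next to the source file, assuming the build supports both formats.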

Examples

See the dams-examples.md file for examples of how this container is used with the DAMS.

Server

TODO
