This implementation maps the RDF output of ExifTool into CASE.
Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.
To install this software, clone this repository, and run python3 setup.py install
from within this directory. (You might want to do this in a virtual environment.)
The tests build several examples of mapped ExifTool output. For instance, a sample analysis of a govdocs1 file is here. The tests directory demonstrates the expected usage patterns of ExifTool that this project analyzes.
Of note, ExifTool is expected to be run twice, to take advantage of some of the mapped information. Both run types use these flags:
-binary
-duplicates
-xmlFormat
Then, one run, the "Raw" run, is expected to use this flag:
--printConv
(Note the double dash)
You can see the expected command line forms in this Makefile, observing the recipes for 799987_printConv.xml
and 799987_raw.xml
. Note that ExifTool by default does not put the top @rdf:about
into an IRI form, which causes some RDF processing libraries to reject ingesting the output. Hence, one necessary postprocessing step is to translate the @rdf:about
attribute into an IRI. (This was done for the tests directory with a sed
command.)
This repository follows CASE community guidance on describing development status, by adherence to noted support requirements.
The status of this repository is:
4 - Beta
A future feature of this repository will be a listing of this code base's coverage of ExifTool's RDF namespaces. For now, the set of predicates found from analyzing the JPEG files of the govdocs1 corpus is listed here.
Any concepts from ExifTool's RDF output that are not mapped in this code base are preserved by attaching to the CASE output. See e.g. all of the namespaced concepts starting with the prefix exiftool-
in this example output file.
This project follows SEMVER 2.0.0 where versions are declared.
This repository supports the CASE and UCO ontology versions that are distributed with the CASE-Utilities-Python repository, at the newest version below a ceiling-pin in setup.cfg. Currently, those ontology versions are:
- CASE 1.3.0
- UCO 1.3.0
This repository is available at the following locations:
- https://github.com/casework/CASE-Implementation-ExifTool
- https://github.com/usnistgov/CASE-Implementation-ExifTool (a mirror)
Releases and issue tracking will be handled at the casework location.
Some make
targets are defined for this repository:
all
- No effect.check
- Run unit tests. NOTE: The tests entail downloading some software to assist with formatting and conversion, from PyPI and from a third party.make download
retrieves these files.clean
- Remove test build files, but not downloaded files or thetests/venv
virtual environment.distclean
- Runmake clean
and further delete downloaded files and thetests/venv
virtual environment. Neitherclean
nordistclean
will remove downloaded submodules.download
- Download files sufficiently to run the unit tests offline. This will not include the ontology repositories tracked as submodules. Note if you do need to work offline, be aware touching thesetup.py
file in the root, ortests/requirements.txt
, will trigger a virtual environment rebuild.
Note that downloading known sample binary data (JPEG files) is not yet done for unit testing, but might be done in the future.
This repository is tested in several POSIX environments. See the dependencies/ directory for package-installation and -configuration scripts for some of the test environments.
Note that running tests in FreeBSD requires running gmake
, not make
.