Enables extraction of measurement data from binary files with extension 'raw' used by proprietary software imcFAMOS/imcSTUDIO and facilitates its storage in open source file formats

RecordEvolution/IMCtermite

IMCtermite

IMCtermite provides access to the proprietary IMC2 Data Format with the file extension .raw (or .dat), introduced and developed by imc Test & Measurement GmbH. This data format is used, among others, by the measurement hardware imc CRONOSflex to dump and store data and by the software packages imc Studio and imc FAMOS for measurement data control and analysis. Thanks to the integrated Python module, the extracted measurement data can be stored in any open file format accessible from Python, such as csv, json or parquet.

On the Record Evolution Platform, the library can be used both as a command line tool for interactive usage and as a Python module to integrate the .raw format into any ETL workflow.

Overview

File format

A file of the IMC2 Data Format type with extension .raw (or .dat) is a mixed text/binary file featuring a set of markers (keys) that indicate the start of various blocks of data, which provide meta information and the actual measurement data. Every marker is introduced by the character "|" = 0x7c followed by two uppercase letters that characterize the type of marker. Each block is further divided into several parameters separated by commas "," = 0x2c and terminated by a semicolon ";" = 0x3b. For instance, the header (the first 600 bytes) of a raw file may look like this (in UTF-8 encoding):

|CF,2,1,1;|CK,1,3,1,1;
|NO,1,86,0,78,imc STUDIO 5.0 R10 (04.08.2017)@imc DEVICES 2.9R7 (25.7.2017)@imcDev__15190567,0,;
|CG,1,5,1,1,1; |CD,2,  63,  5.0000000000000001E-03,1,1,s,0,0,0,  0.0000000000000000E+00,1;
|NT,1,16,1,1,1980,0,0,0.0;       |CC,1,3,1,1;|CP,1,16,1,4,7,32,0,0,1,0;
|CR,1,60,0,  1.0000000000000000E+00,  0.0000000000000000E+00,1,4,mbar;|CN,1,27,0,0,0,15,pressure_Vacuum,0,;
|Cb,1, 117,1,0,    1,         1,         0,      9608,         0,      9608,1,  2.0440300000000000E+03,  1.2416717060000000E+09,;
|CS,1,      9619,         1,�oD	�nD6�nD)�nD�

Line breaks are introduced for readability. Most of the markers introduce blocks of text, while only the last block, identified by |CS, contains binary data. The format supports the storage of multiple data sets (channels) in a single file. The channels may be ordered in multiplex mode (ordering w.r.t. time) or block mode (ordering w.r.t. channels).

The markers (keys) are introduced by "|" = 0x7c followed by two uppercase letters. There are two types of markers, distinguished by the first letter:

  1. critical markers: introduced by |C featuring uppercase C
  2. noncritical markers: introduced by |N featuring uppercase N

The second letter represents further details of the specific key. Note that while the noncritical keys are optional, a .raw file cannot be correctly decoded if any of the critical markers is misinterpreted, invalid or damaged. The second uppercase letter is followed by the first comma and the version of the key, starting from 1. After the next comma, a (long) integer (in text representation) specifies the length of the entire block, i.e. the number of bytes between the following comma and the block-terminating semicolon. The further structure of a block is not defined and may feature different numbers of additional parameters. The format allows for any number of carriage returns (CR = 0x0d) and line feeds (LF = 0x0a) between keys, i.e. between the block-terminating semicolon and the vertical bar (pipe) of the next key. The following critical markers are defined:

marker  description
CF      format version and processor
CK      start of group of keys, no. parameters = 3, indicates (in)correct closure of the measurement series
CB      defines a group of channels
CT      text definition including group association index
CG      introduces group of components corresponding to CC keys
CD1,2   old/new version of abscissa description
CZ      scaling of z-axis for segments
CC      start of a component
CP      information about buffer, datatype and samples of component
Cb      buffer description
CR      permissible range of values in component
CN      name and comment of channel
CS      raw binary data
CI      single numerical value (including unit)
Ca      add reference key

Among the noncritical markers, there are

marker  description
NO      origin of data
NT      timestamp of trigger
ND      (color) display properties
NU      user defined key
Np      property of a channel
NE      extraction rule for channels from BUS data
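The key layout described above (pipe, two-letter key, version, block length, body, semicolon) can be walked with a short Python sketch. It only illustrates the layout and is not IMCtermite's actual parser; the names Block and parse_blocks are made up for this example. Two assumptions go beyond the text: a lowercase second letter is accepted, since keys such as Cb and Np appear in the tables, and blanks are tolerated before the length field and between blocks, as seen in the sample header.

```python
import re
from typing import List, NamedTuple

# "|" + two-letter key + "," + version + "," + length + ","
# (assumption: second letter may be lowercase, blanks may precede the length)
KEY_HEADER = re.compile(rb"\|([CN][A-Za-z]),(\d+),\s*(\d+),")


class Block(NamedTuple):
    key: str        # e.g. "CF", "NO", "CS"
    version: int    # version of the key, starting from 1
    body: bytes     # bytes between the comma after the length and the ";"
    critical: bool  # True for |C... keys, False for |N... keys


def parse_blocks(data: bytes) -> List[Block]:
    blocks = []
    pos = 0
    while pos < len(data):
        # CR/LF (and blanks, as in the sample header) may separate blocks
        if data[pos:pos + 1] in (b"\r", b"\n", b" "):
            pos += 1
            continue
        match = KEY_HEADER.match(data, pos)
        if match is None:
            raise ValueError(f"expected a marker at byte offset {pos}")
        length = int(match.group(3))
        start = match.end()
        end = start + length
        # rely on the declared length to skip the body, so that "|" or ";"
        # inside binary CS data is never mistaken for a delimiter
        if data[end:end + 1] != b";":
            raise ValueError(f"block at offset {pos} is not terminated by ';'")
        key = match.group(1).decode("ascii")
        blocks.append(Block(key, int(match.group(2)), data[start:end], key[0] == "C"))
        pos = end + 1
    return blocks


# first blocks of the sample header shown earlier
header = (
    b"|CF,2,1,1;|CK,1,3,1,1;\r\n"
    b"|NO,1,86,0,78,imc STUDIO 5.0 R10 (04.08.2017)@imc DEVICES 2.9R7 "
    b"(25.7.2017)@imcDev__15190567,0,;\r\n"
    b"|CG,1,5,1,1,1;"
)
for block in parse_blocks(header):
    print(block.key, block.version, len(block.body), block.critical)
```

Skipping by the declared length rather than searching for the next semicolon is what makes the mixed text/binary layout workable: the text blocks could be split naively, but the body of |CS may contain any byte value.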

The format loosely defines some rules for the ordering of the markers in the file stream. The rules for critical keys include: CK has to follow CF; CK may be followed by any number of CG blocks; each CG has to be followed by (any number of) component sequences consisting of the series CC, CP, (CR), (ND) and terminated by either CS or the start of a new group, component, text field or buffer.
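A few of these ordering rules can be checked on a plain list of key names. The following is a minimal sketch under stated assumptions, not a full validator: the function name check_key_order is hypothetical, noncritical keys are simply ignored, and only three rules are covered (CF comes first, CK follows CF, every CC is followed by CP).

```python
from typing import List


def check_key_order(keys: List[str]) -> List[str]:
    """Return a list of human-readable violations (empty if none were found)."""
    # assumption: noncritical keys (|N...) are optional and may appear anywhere,
    # so only the critical keys are considered here
    critical = [k for k in keys if k.startswith("C")]
    problems = []
    if not critical or critical[0] != "CF":
        problems.append("first critical key is not CF")
    if len(critical) < 2 or critical[1] != "CK":
        problems.append("CF is not followed by CK")
    # positions below refer to the index within the list of critical keys
    for i, key in enumerate(critical):
        if key == "CC" and (i + 1 >= len(critical) or critical[i + 1] != "CP"):
            problems.append(f"CC at position {i} is not followed by CP")
    return problems


# key sequence of the sample header shown earlier
sample = ["CF", "CK", "NO", "CG", "CD", "NT", "CC", "CP", "CR", "CN", "Cb", "CS"]
print(check_key_order(sample))
```

Returning a list of violations instead of raising on the first one fits the "loosely defined" nature of the rules: a caller can decide which findings are fatal.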

Installation

The IMCtermite library may be used both as a CLI tool and as a Python module.

CLI tool

To build the CLI tool locally, run the default target make, which produces the binary imctermite. To make the tool available system-wide, install it (to the default location /usr/local/bin) via

make install

which may require root permissions.

Python

To integrate the library into a customized ETL toolchain, several cython targets are available. For a local build that enables you to run the examples, use:

make cython-build

However, in a production environment, a proper installation of the module with make cython-install is recommended for system-wide availability of the module.

Installation with pip

The package is also available on the Python Package Index as IMCtermite. To install the latest version, run

python3 -m pip install IMCtermite

which provides binary wheels for multiple architectures on Windows and Linux and most Python 3.x distributions. However, if your platform/architecture is not supported, you can still compile the source distribution yourself, which requires python3_setuptools and an up-to-date compiler supporting the C++11 standard (e.g. gcc version >= 10.2.0).

Usage

CLI

The usage of the imctermite binary looks like this:

imctermite <raw-file> [options]

You have to provide a single raw file and any option to specify what to do with the data. All available options can be listed with imctermite --help:

Options:

 -c, --listchannels      list channels
 -b, --listblocks        list IMC key-blocks
 -d, --output            output directory to print channels
 -s, --delimiter         csv delimiter/separator char for output
 -h, --help              show this help message
 -v, --version           display version

For instance, to show a list of all channels included in sample-data.raw, run imctermite sample-data.raw --listchannels. No output files are written by default. Output files are written only when an existing (!) directory is provided as argument to the --output option. By default, every output file is written using "," as delimiter. You may provide any custom separator with the option --delimiter. For example, in order to use "|", the binary is called as imctermite sample-data.raw -b -c -s '|'.
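When the CLI is driven from a script, the options above can be assembled programmatically. The sketch below is one possible wrapper, not part of IMCtermite: build_command and export_channels are hypothetical helper names, and it assumes that --output and --delimiter take their value as the next argument and that the imctermite binary is on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_command(raw_file: str, output_dir: Optional[str] = None,
                  delimiter: Optional[str] = None,
                  list_channels: bool = False) -> List[str]:
    # only long options documented in the help text above are used
    cmd = ["imctermite", raw_file]
    if list_channels:
        cmd.append("--listchannels")
    if output_dir is not None:
        cmd += ["--output", output_dir]
    if delimiter is not None:
        cmd += ["--delimiter", delimiter]
    return cmd


def export_channels(raw_file: str, output_dir: str, delimiter: str = ",") -> None:
    # the tool writes files only into an existing directory, so create it first
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_command(raw_file, output_dir, delimiter), check=True)


print(build_command("sample-data.raw", output_dir="out", delimiter="|"))
```

Passing the command as a list avoids shell quoting issues, which matters here because a delimiter such as | would otherwise be interpreted by the shell as a pipe.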

Python

Given the IMCtermite module is available, we can import it and declare an instance of it by passing a raw file to the constructor:

import IMCtermite

imcraw = IMCtermite.imctermite(b"samples/sampleA.raw")

An example of how to create an instance and obtain the list of channels is:

import IMCtermite

# declare and initialize instance of "imctermite" by passing a raw-file
try:
    imcraw = IMCtermite.imctermite(b"samples/sampleA.raw")
except RuntimeError as e:
    # stop here: without a parsed file there is no "imcraw" to query below
    raise SystemExit("failed to load/parse raw-file: " + str(e))

# obtain list of channels as list of dictionaries (without data)
channels = imcraw.get_channels(False)
print(channels)

A more complete example, including the methods for obtaining the channels and their data and/or printing them directly to files, can be found in the python/examples folder.
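Since get_channels(False) returns a list of dictionaries, the channel metadata can be written to an open format such as JSON with the standard library alone. The following is a minimal sketch: load_channels and channels_to_json are hypothetical helper names, and default=str is a defensive assumption in case a value is not natively JSON-serializable.

```python
import json
from typing import Any, Dict, List


def load_channels(raw_path: bytes) -> List[Dict[str, Any]]:
    # imported lazily so that channels_to_json can be used without the module
    import IMCtermite
    imcraw = IMCtermite.imctermite(raw_path)
    return imcraw.get_channels(False)  # metadata only, no measurement data


def channels_to_json(channels: List[Dict[str, Any]]) -> str:
    # default=str: assumption, falls back to str() for values json cannot encode
    return json.dumps(channels, indent=2, default=str)


# usage, assuming IMCtermite is installed and the sample file exists:
#   print(channels_to_json(load_channels(b"samples/sampleA.raw")))
```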
