Commit

first draft layout
mmaelicke committed Aug 20, 2024
1 parent bb09037 commit 57072fa
Showing 6 changed files with 50 additions and 201 deletions.
26 changes: 13 additions & 13 deletions CITATION.cff
@@ -2,10 +2,13 @@
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Tool Spec Python template
title: Metacatalog GeoCube aggregator
message: >-
Please replace this citation information with appropriate
metadata for your tool
This tool is designed to be used together with the V-FOR-WaTer Metacatalog data loader.
It uses a number of data source files along with either a metacatalog entry or JSON dumps
of the metadata.
The data is aggregated to a target precision (temporal) and spatial resolution and then
ingested into a geocube that is stored as a netCDF file.
type: software
authors:
- given-names: Mirko
@@ -15,14 +18,7 @@ authors:
Institute for Water and Environment, Hydrology,
Karlsruhe Institute for Technology (KIT)
orcid: 'https://orcid.org/0000-0002-0424-2651'
- given-names: Alexander
family-names: Dolich
email: [email protected]
affiliation: >-
Institute for Water and Environment, Hydrology,
Karlsruhe Institute for Technology (KIT)
orcid: 'https://orcid.org/0000-0003-4160-6765'
repository-code: 'https://github.com/VForWaTer/tool_template_python'
repository-code: 'https://github.com/hydrocode-de/metacatalog_aggregator'
url: 'https://vforwater.github.io/tool-specs/'
abstract: >-
This is a Github repository template for scientific data
@@ -34,6 +30,10 @@ keywords:
- docker
- tool-spec
- V-For-WaTer
- MetaCatalog
- netCDF
- DataCube
- open data cube
license: CC-BY-4.0
version: '0.5'
date-released: '2024-07-30'
version: '0.1'
date-released: '2024-08-20'
20 changes: 10 additions & 10 deletions Dockerfile
@@ -1,16 +1,16 @@
# Pull any base image that includes python3
FROM python:3.12
FROM python:3.12.2

# install the toolbox runner tools
RUN pip install "json2args[data]>=0.6.2"

# if you do not need data-preloading as your tool does that on its own
# you can use this instead of the line above to use a json2args version
# with fewer dependencies
# RUN pip install "json2args>=0.6.2"

# Do anything you need to install tool dependencies here
RUN echo "Replace this line with a tool"
RUN pip install "json2args>=0.6.2" \
metacatalog==0.9.2 \
ipython==8.26.0 \
pandas==2.2.2 \
geopandas==1.0.1 \
xarray[complete]==2024.7.0 \
rioxarray==0.17.0 \
polars-lts-cpu==1.1.0 \
geocube==0.6.0

# create the tool input structure
RUN mkdir /in
121 changes: 0 additions & 121 deletions LICENSE

This file was deleted.

64 changes: 14 additions & 50 deletions README.md
@@ -1,86 +1,50 @@
# tool_template_python
# Metacatalog aggregator

[![Docker Image CI](https://github.com/VForWaTer/tool_template_python/actions/workflows/docker-image.yml/badge.svg)](https://github.com/VForWaTer/tool_template_python/actions/workflows/docker-image.yml)
[![DOI](https://zenodo.org/badge/558416591.svg)](https://zenodo.org/badge/latestdoi/558416591)

This is the template for a generic containerized Python tool following the [Tool Specification](https://vforwater.github.io/tool-specs/) for reusable research software using Docker.
This tool is designed to be used together with the V-FOR-WaTer [Metacatalog data loader](https://github.com/VForWaTer/tool_vforwater_loader).
It uses a number of data source files along with either a metacatalog entry or JSON dumps of the metadata. The data is aggregated to a target precision (temporal) and spatial resolution and then ingested into a geocube that is stored as a netCDF file.

This template can be used to generate new GitHub repositories.
This tool is based on the [Python template](https://github.com/vforwater/tool_template_python) for a generic containerized Python tool following the [Tool Specification](https://vforwater.github.io/tool-specs/) for reusable research software using Docker.
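
To illustrate what aggregating to a target temporal precision means, here is a minimal sketch using only the Python standard library. It is not the tool's implementation (the image installs pandas, polars and xarray for that); the function name `aggregate_daily`, the daily precision and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_daily(records):
    """Average (timestamp, value) pairs per calendar day (hypothetical helper)."""
    buckets = defaultdict(list)
    for timestamp, value in records:
        # truncate each timestamp to the target precision, here one day
        buckets[timestamp.date()].append(value)
    # one aggregated value per day, ordered by date
    return {day: mean(values) for day, values in sorted(buckets.items())}


# made-up sample data: two readings on the first day, one on the second
records = [
    (datetime(2024, 8, 20, 6, 0), 1.0),
    (datetime(2024, 8, 20, 18, 0), 3.0),
    (datetime(2024, 8, 21, 12, 0), 5.0),
]
daily = aggregate_daily(records)
```

The same grouping idea applies to coarser or finer precisions by changing how the timestamp is truncated.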


## How generic?

Tools using this template can be run by the [toolbox-runner](https://github.com/hydrocode-de/tool-runner).
That is only a convenience; the tools implemented using this template are independent of any framework.

The main idea is to implement a common file structure inside the container to load inputs and outputs of the
tool. The template shares this structure with the [R template](https://github.com/vforwater/tool_template_r),
[NodeJS template](https://github.com/vforwater/tool_template_node) and [Octave template](https://github.com/vforwater/tool_template_octave),
but it can be mimicked in any container.

Each container needs at least the following structure:
## Structure

```
/
|- in/
| |- parameters.json
| |- input.json
|- out/
| |- ...
|- src/
| |- tool.yml
| |- run.py
| |- CITATION.cff
```

* `parameters.json` contains the parameters. Whichever framework runs the container, this is how parameters are passed.
* `input.json` contains the parameters. Whichever framework runs the container, this is how parameters are passed.
* `tool.yml` is the tool specification. It contains metadata about the scope of the tool, the number of endpoints (functions) and their parameters
* `run.py` is the tool itself, or a Python script that handles the execution. It has to capture all outputs and either `print` them to console or create files in `/out`
* `run.py` is the tool itself
* `CITATION.cff` is a citation file that describes the tool and its authors. It is used by the

## How to build the image?

You can build the image from within the root of this repo by
```
docker build -t tbr_python_template .
docker build -t metacatalog_geocube .
```

Use any tag you like. If you want to run and manage the container with [toolbox-runner](https://github.com/hydrocode-de/tool-runner),
the tag should be prefixed with `tbr_` to be recognized.

Alternatively, the contained `.github/workflows/docker-image.yml` will build the image for you
on new releases on Github. You need to change the target repository in the aforementioned yaml.

## How to run?

This template installs the json2args Python package to parse the parameters in the `/in/parameters.json`. This assumes that
the files are not renamed or moved and that there is only one tool in the container. For any other case, the environment variable
`PARAM_FILE` can be used to specify a new location for the `parameters.json` and `TOOL_RUN` can be used to specify the tool to be executed.
This template installs the json2args Python package to parse the parameters in the `/in/input.json`. This assumes that
the files are not renamed or moved and that there is only one tool in the container. For any other case, the environment variable `PARAM_FILE` can be used to specify a new location for the `input.json` and `TOOL_RUN` can be used to specify the tool to be executed.
The `run.py` has to take care of that.
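
As a rough illustration of that responsibility, the following standard-library sketch shows one way a `run.py` might honor `PARAM_FILE` and `TOOL_RUN`. It is not the json2args implementation; the helper names `resolve_run_config` and `load_parameters`, the default tool name `geocube` and the sample file content are assumptions.

```python
import json
import os
import tempfile


def resolve_run_config(env=os.environ):
    """Return (parameter file path, tool name); hypothetical helper."""
    # PARAM_FILE overrides the default location of the parameter file
    param_file = env.get("PARAM_FILE", "/in/input.json")
    # TOOL_RUN names the tool to execute; the default here is an assumption
    tool_name = env.get("TOOL_RUN", "geocube")
    return param_file, tool_name


def load_parameters(param_file):
    """Read the parameter file as plain JSON."""
    with open(param_file) as f:
        return json.load(f)


# demo: point PARAM_FILE at a temporary file with made-up content
with tempfile.TemporaryDirectory() as tmp:
    custom = os.path.join(tmp, "custom.json")
    with open(custom, "w") as f:
        json.dump({"geocube": {}}, f)
    param_file, tool_name = resolve_run_config({"PARAM_FILE": custom})
    params = load_parameters(param_file)
```

Passing the environment as an argument keeps the lookup easy to test without touching the real process environment.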

To invoke the docker container directly run something similar to:
```
docker run --rm -it -v /path/to/local/in:/in -v /path/to/local/out:/out -e TOOL_RUN=foobar tbr_python_template
docker run --rm -it -v /path/to/local/in:/in -v /path/to/local/out:/out -e TOOL_RUN=geocube metacatalog_geocube
```

Then, the output will be in your local out and based on your local input folder. Stdout and Stderr are also connected to the host.
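
If you prefer to launch the container from a script, the command above can be assembled in Python. This is only a sketch: the helper name `docker_run_command` is hypothetical, the image tag and tool name are taken from the example command, and `-it` is dropped on the assumption that the script runs non-interactively.

```python
import shlex
from pathlib import Path


def docker_run_command(in_dir, out_dir, tool="geocube", image="metacatalog_geocube"):
    """Build the argument list for `docker run` (hypothetical helper)."""
    # docker needs absolute host paths for bind mounts
    in_dir = Path(in_dir).resolve()
    out_dir = Path(out_dir).resolve()
    return [
        "docker", "run", "--rm",
        "-v", f"{in_dir}:/in",
        "-v", f"{out_dir}:/out",
        "-e", f"TOOL_RUN={tool}",
        image,
    ]


cmd = docker_run_command("./in", "./out")
print(shlex.join(cmd))  # shows the command that would be executed
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems when the local paths contain spaces.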

With the [toolbox runner](https://github.com/hydrocode-de/tool-runner), this is simplified:

```python
from toolbox_runner import list_tools
tools = list_tools() # dict with tool names as keys

foobar = tools.get('foobar') # it has to be present there...
foobar.run(result_path='./', foo_int=1337, foo_string="Please change me")
```
The example above will create a temporary file structure to be mounted into the container. On termination, it creates a `.tar.gz` in the current working directory
that contains all inputs, outputs, specifications and some metadata, including the sha256 of the image used to create the output.

## What about real tools, no foobar?

Yeah.

1. change the `tool.yml` to describe your actual tool
2. add any `pip install` or `apt-get install` needed to the Dockerfile
3. add additional source code to `/src`
4. change the `run.py` to consume parameters and data from `/in` and write useful output to `/out`
5. build, run, rock!

8 changes: 5 additions & 3 deletions RELEASE.md
@@ -1,5 +1,7 @@
# tool_template_python
# Metacatalog aggregator

This is the template for a generic containerized Python tool following the [Tool Specification](https://vforwater.github.io/tool-specs/) for reusable research software using Docker.

This template can be used to generate new GitHub repositories.
This tool is designed to be used together with the V-FOR-WaTer [Metacatalog data loader](https://github.com/VForWaTer/tool_vforwater_loader).
It uses a number of data source files along with either a metacatalog entry or JSON dumps of the metadata. The data is aggregated to a target precision (temporal) and spatial resolution and then ingested into a geocube that is stored as a netCDF file.

This tool is based on the [Python template](https://github.com/vforwater/tool_template_python) for a generic containerized Python tool following the [Tool Specification](https://vforwater.github.io/tool-specs/) for reusable research software using Docker.
12 changes: 8 additions & 4 deletions src/tool.yml
@@ -1,8 +1,12 @@
tools:
foobar:
title: Foo Bar
description: A dummy tool to exemplify the YAML file
version: 0.1
geocube:
title: Metacatalog GeoCube
description: |
This tool is designed to be used together with the V-FOR-WaTer Metacatalog data loader.
It uses a number of data source files along with either a metacatalog entry or JSON dumps
of the metadata.
The data is aggregated to a target precision (temporal) and spatial resolution and then
ingested into a geocube that is stored as a netCDF file.
parameters:
foo_int:
type: integer