add a readme file for parameters mapping workflow
1 parent
1b493df
commit 2181b8b
Showing 1 changed file with 37 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
# Parameters Mapping or how to add a metadata header for CSV files | ||
|
||
1. Check that the parameters to add are listed in [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING) | ||
* ```parameters.csv``` lists the available parameters and their ids | ||
* ```qc_flags.csv``` | ||
* ```qc_scheme.csv``` | ||
* ```unit_view.csv``` lists the available units and their ids (CF names, long names and ids) | ||
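The check in this first step can be scripted. The following is a minimal sketch, assuming `parameters.csv` has a header row and one column holding the parameter name. The column name `name`, the helper `missing_parameters` and the sample rows are hypothetical, so adjust them to the actual file.

```python
import csv
import io


def missing_parameters(csv_file, wanted, name_column="name"):
    """Return the entries of `wanted` not found in `name_column` of the CSV.

    `name_column` is an assumed header name, not taken from the real file.
    """
    reader = csv.DictReader(csv_file)
    known = {row[name_column].strip() for row in reader}
    return [name for name in wanted if name not in known]


# Illustrative in-memory stand-in for parameters.csv (made-up rows).
sample = io.StringIO("id,name\n1,TEMP\n2,PSAL\n")
print(missing_parameters(sample, ["TEMP", "DEPTH"]))
```

Any name returned by the helper would be a new parameter that has to be added before it can be mapped. To run it on the real file, pass `open("parameters.csv", newline="")` instead of the in-memory sample.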
|
||
1. New parameters | ||
* need to follow the IMOS vocabulary [BENE PLEASE UPDATE] | ||
|
||
1. Map the parameters for your dataset collection | ||
* update ```parameters_mapping.csv```. This file brings together the information from the other files: each variable name, as written in the column header of the csv, is matched to the unique id of its parameter in ```parameters.csv```, of its unit in ```unit_view.csv```, ... | ||
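Because `parameters_mapping.csv` only holds ids that point at the other files, a quick consistency check before committing can catch an id that does not exist. This is a sketch under assumptions: the column names `parameter_id` and `id`, the helper `unknown_ids` and the sample rows are all hypothetical and need to be replaced with the real headers.

```python
import csv
import io


def unknown_ids(mapping_file, reference_file, mapping_column, reference_column="id"):
    """Return the ids used in the mapping that are absent from the reference file.

    Both column names are assumptions; the real CSV headers may differ.
    """
    known = {row[reference_column] for row in csv.DictReader(reference_file)}
    used = {row[mapping_column] for row in csv.DictReader(mapping_file)}
    return sorted(used - known)


# Made-up stand-ins for parameters_mapping.csv and parameters.csv.
mapping = io.StringIO("variable_name,parameter_id,unit_id\nTEMP,1,10\nDEPTH,3,11\n")
parameters = io.StringIO("id,name\n1,TEMP\n2,PSAL\n")
print(unknown_ids(mapping, parameters, "parameter_id"))
```

The same helper could be pointed at `unit_view.csv` by passing the unit column of the mapping instead. Ids are compared as strings, exactly as they are read from the files.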
|
||
1. Create the views in the Parameters mapping harvester: update the liquibase to update/include the new views in the [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) | ||
* _(for beginners!)_ before starting the stack and opening Talend, check the view to update/create in pgAdmin (db-rc); it is easier to understand the query there before updating the liquibase via Talend | ||
* start your stack, restoring the parameters_mapping schema and the schema you are working on | ||
```RestoreDatabaseSchemas: - schema: parameters_mapping, - schema: working_schema``` | ||
* start your pipeline box and Talend | ||
* update the liquibase in the ```Create parameters_mapping views``` sub-job | ||
* temporarily remove from the liquibase (do not include this in your commit!) the create view statements for the following 6 views, as each of these queries calls the schema of its respective dataset collection: | ||
`aatams_biologging_shearwater_metadata_summary`; | ||
`aatams_biologging_snowpetrel_metadata_summary`; | ||
`aatams_sattag_dm_metadata_summary`; | ||
`aatams_sattag_nrt_metadata_summary`; | ||
`aodn_nt_sattag_hawksbill_metadata_summary`; | ||
`aodn_nt_sattag_oliveridley_metadata_summary` | ||
* check in the stack database that the views are created as expected | ||
* test the full pipeline to confirm that the parameters_mapping information is integrated in the csv accessible from the stack portal | ||
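Checking that the views were created can be done by listing the views in the stack database and comparing them with the ones you expect. The sketch below assumes the views live in the `parameters_mapping` schema. The helper `missing_views` and the view name in the usage lines are hypothetical, while `information_schema.views` is the standard PostgreSQL catalog view.

```python
# Query to run against the stack database (for example in pgAdmin or psql).
# The schema name is an assumption based on the restored schema above.
LIST_VIEWS_SQL = """
SELECT table_name
FROM information_schema.views
WHERE table_schema = 'parameters_mapping'
ORDER BY table_name;
"""


def missing_views(expected, existing):
    """Return the expected view names that are not in `existing`, in order."""
    existing_set = set(existing)
    return [view for view in expected if view not in existing_set]


# `existing` would be the table_name values returned by LIST_VIEWS_SQL.
expected = ["my_collection_metadata_summary"]  # hypothetical view name
existing = []
print(missing_views(expected, existing))
```

The 6 views removed temporarily for the local test should be left out of `expected`, since they are not created on the stack.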
|
||
1. Merge the changes made in both repositories, to test on RC before merging to production | ||
* [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING) | ||
* [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) | ||
|
||
The [PARAMETERS_MAPPING harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) runs as a daily cron job, Monday to Friday. | ||
It harvests the content of these 5 files into the parameters_mapping DB schema and creates a _metadata_summary view for each of the collections listed (it is not IMOS specific; for example, we have a mapping for the AODN _WAVE_DM + NRT collections) | ||
|