CSW

Catalogue Service for the Web (CSW) is a standardised protocol for querying remote catalogues.

This library can be used to fetch records from CSW servers.

Some relevant CSW catalogues are:

The script uses the owslib library to fetch records and stores them in the PostgreSQL database table harvest.items, which has the following structure:

CREATE TABLE IF NOT EXISTS harvest.items
(
    identifier text COLLATE pg_catalog."default" NOT NULL,
    identifiertype character varying(50) COLLATE pg_catalog."default",
    itemtype character varying(50) COLLATE pg_catalog."default",
    resultobject text COLLATE pg_catalog."default" NOT NULL,
    resulttype character varying(50) COLLATE pg_catalog."default",
    uri text COLLATE pg_catalog."default" NOT NULL,
    insert_date timestamp without time zone,
    source text COLLATE pg_catalog."default",
    hash text COLLATE pg_catalog."default",
    turtle text COLLATE pg_catalog."default",
    date character varying(10) COLLATE pg_catalog."default",
    error text COLLATE pg_catalog."default",
    language character varying(9) COLLATE pg_catalog."default",
    project text COLLATE pg_catalog."default",
    CONSTRAINT item_hash UNIQUE (hash)
)
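
As a rough illustration of the fetch-and-store flow described above, the sketch below pages through a CSW endpoint with owslib and maps each record onto a few of the harvest.items columns. The function names, the MD5-based hash, and the value used for uri are assumptions made for illustration, not the script's actual implementation. owslib is imported inside the fetch function so that the row helper can be used without it.

```python
import hashlib
from datetime import datetime


def to_item_row(identifier, xml, source):
    """Map one fetched record onto a subset of the harvest.items columns.

    Assumption: the hash column is an MD5 digest of the raw record, so the
    UNIQUE (hash) constraint rejects records that were already stored.
    """
    if isinstance(xml, bytes):
        xml = xml.decode("utf-8")
    return {
        "identifier": identifier,
        "resultobject": xml,
        "uri": source,  # hypothetical: the real script may store a per-record URI
        "source": source,
        "hash": hashlib.md5(xml.encode("utf-8")).hexdigest(),
        "insert_date": datetime.now(),
    }


def fetch_rows(url, page_size=50, max_records=200):
    """Yield rows for at most max_records records from a CSW endpoint."""
    # Imported lazily so to_item_row works without owslib installed.
    from owslib.csw import CatalogueServiceWeb

    csw = CatalogueServiceWeb(url)
    start, fetched = 0, 0
    while fetched < max_records:
        csw.getrecords2(startposition=start, maxrecords=page_size, esn="full")
        if not csw.records:
            break
        for identifier, record in csw.records.items():
            if fetched >= max_records:
                return
            yield to_item_row(identifier, record.xml, url)
            fetched += 1
        start = csw.results.get("nextrecord", 0)
        if start == 0:  # the server reports 0 when there are no further pages
            break
```

Each yielded dictionary could then be written to harvest.items with a parameterised INSERT; rows whose hash already exists would be rejected by the item_hash constraint.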

A harvester run is best configured as a CI/CD pipeline in Git.

Environment variables

The script is configured through the following environment variables, which can also be added to a .env file:

  • POSTGRES_HOST
  • POSTGRES_PORT
  • POSTGRES_DB
  • POSTGRES_USER
  • POSTGRES_PASSWORD
  • HARVEST_URL
  • HARVEST_FILTER
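
The variables listed above could be collected in Python roughly as follows. This is a minimal sketch: the function name load_settings, the choice of which variables are required, and the default port are assumptions rather than the script's documented behaviour.

```python
import os


def load_settings(env=None):
    """Collect harvester settings from environment variables.

    Pass a dict as env to try the function without touching the real
    environment.
    """
    env = os.environ if env is None else env
    # Assumption: these variables are mandatory; the others are optional.
    required = ["POSTGRES_HOST", "POSTGRES_DB", "POSTGRES_USER",
                "POSTGRES_PASSWORD", "HARVEST_URL"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env["POSTGRES_HOST"],
        # Assumption: fall back to the PostgreSQL default port when unset.
        "port": int(env.get("POSTGRES_PORT") or "5432"),
        "dbname": env["POSTGRES_DB"],
        "user": env["POSTGRES_USER"],
        "password": env["POSTGRES_PASSWORD"],
        "harvest_url": env["HARVEST_URL"],
        # Assumption: an unset filter means no filtering is applied.
        "harvest_filter": env.get("HARVEST_FILTER", ""),
    }
```

Values kept in a .env file would first have to be loaded into the environment, for example with the python-dotenv package.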

HARVEST_FILTER syntax

The filter is a JSON object of key-value pairs:

export HARVEST_FILTER='{"keywords":"Soil","type":"dataset"}'
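
A filter value like the one above can be decoded with the standard json module. The sketch below is illustrative only: parse_harvest_filter is a hypothetical helper, and how the script translates the resulting pairs into CSW query constraints is not specified here.

```python
import json


def parse_harvest_filter(raw):
    """Parse a HARVEST_FILTER string into a dict of string keys and values."""
    if not raw:
        # Assumption: an empty or unset filter means no filtering.
        return {}
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("HARVEST_FILTER must be a JSON object")
    return {str(key): str(value) for key, value in parsed.items()}


filters = parse_harvest_filter('{"keywords":"Soil","type":"dataset"}')
print(filters)  # {'keywords': 'Soil', 'type': 'dataset'}
```

Rejecting anything other than a JSON object early gives a clearer error than failing later, when the pairs are turned into query constraints.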