Add deployment docs
penelopeysm committed Jul 2, 2024
1 parent 764ef06 commit f59a337
Showing 1 changed file with 35 additions and 0 deletions.
35 changes: 35 additions & 0 deletions docs/deployment.md
@@ -0,0 +1,35 @@
# Deployment

This page explains how to generate a complete set of metadata and data conforming to the [output specification](output_structure.md), and how to upload it to Azure blob storage.

## Overview

The deployment process is divided into two main steps:

1. **Fetching the data.** This means running the entire job for each country. This step does not publish anything to Azure; instead, it generates pickled data that is stored inside `$DAGSTER_HOME`.

2. **Uploading the data.** This means running each of the cloud sensor assets, which have the names `publish...`. There are four of these assets: one for the `countries.txt` file, one for the metadata structs, one for the geometries, and one for the metrics. These assets are tied to custom IO managers which read the pickled data from `$DAGSTER_HOME` and upload it to Azure blob storage.

Both of these steps are automated using the `popgetter.run` module, which is in turn invoked by the `deploy.sh` script in the repository root.

## Required environment variables

The deployment process requires the following environment variables to be set:

- `$POPGETTER_COUNTRIES`: A comma-separated list of country IDs to generate data for. This list also feeds into the `countries.txt` file.

- `$ENV`: Set this to `prod` to deploy to Azure. (You can set it to `dev` too, but this will publish the data to a local temporary directory, so it's only useful for testing the script.)

- `$SAS_TOKEN`: The SAS token for the Azure blob storage account. Contact a popgetter maintainer if you need this.
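Because all three variables have to be present, it can be handy to check them before starting a long run. The following is a minimal sketch of such a check, not something shipped with popgetter: the function name `check_deploy_env` is hypothetical, and the assumption that `ENV` only ever takes the values `prod` or `dev` is drawn from the description above.

```bash
# Hypothetical pre-flight check; not part of the popgetter repository.
# Reports any required variable that is unset or empty, and any ENV value
# other than the two described above. Exits non-zero if a problem is found.
# The body runs in a subshell, so its helper variables do not leak out.
check_deploy_env() (
    status=0
    for name in POPGETTER_COUNTRIES ENV SAS_TOKEN; do
        # Look up the variable whose name is held in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            status=1
        fi
    done
    # Assumption: `prod` and `dev` are the only meaningful values of ENV.
    case "${ENV:-}" in
        prod|dev|"") ;;   # an empty ENV was already reported above
        *) echo "unexpected ENV value: $ENV" >&2; status=1 ;;
    esac
    exit "$status"
)
```

With the function defined, something like `check_deploy_env && ./deploy.sh` would refuse to start the deployment when a variable is missing.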

So, a typical deployment command might look like this:

```bash
POPGETTER_COUNTRIES=bel,gb_nir ENV=prod SAS_TOKEN="..." ./deploy.sh
```

Note that the `SAS_TOKEN` value must be quoted, because it contains ampersands, which the shell would otherwise interpret as "run this command in the background".
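The effect of leaving the quotes off can be reproduced safely with a made-up value. In this sketch the strings `sv=1` and `sig=2` are placeholders, not a real token, and `sh -c` is used only so that each command line is parsed from scratch.

```bash
# Unquoted: the shell splits the line at `&`. `T=sv=1` runs as a background
# job and `sig=2` becomes a separate assignment, so T is never set in the
# shell that goes on to use it.
unquoted=$(sh -c 'T=sv=1&sig=2; printf "%s" "${T:-UNSET}"')

# Quoted: the ampersand is ordinary text and the whole value is kept.
quoted=$(sh -c 'T="sv=1&sig=2"; printf "%s" "${T:-UNSET}"')

echo "unquoted: $unquoted"   # unquoted: UNSET
echo "quoted:   $quoted"     # quoted:   sv=1&sig=2
```

Either single or double quotes prevent the split; single quotes additionally stop the shell from expanding any `$` characters in the value.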

## Where are the data stored?

The data and metadata will be uploaded to the ['popgetter' Azure storage account](https://portal.azure.com/#@turing.ac.uk/resource/subscriptions/06e7b12a-f395-4021-9fa2-5305fa01903e/resourceGroups/popgetter/providers/Microsoft.Storage/storageAccounts/popgetter/containersList) under the Urban Analytics Technology Platform subscription. Specifically, they will be placed in the container named `prod`, in the directory corresponding to the current version of popgetter.
