
Commit

Merge pull request #89 from CHIMEFRB/86-feature-datatrail-scout-creat…
…e-and-submit-healing-payload-for-files-missing-minoc-replica

86 feature datatrail scout create and submit healing payload for files missing minoc replica
tjzegmott authored Jun 6, 2024
2 parents 808a9ff + 7f633b1 commit 73171dc
Showing 5 changed files with 152 additions and 10 deletions.
17 changes: 10 additions & 7 deletions docs/commands.md
Original file line number Diff line number Diff line change
@@ -6,15 +6,18 @@

The commands available to you are:

- `list`: This lists either the 'scopes' available or all of the datasets
belonging to the given scope.
- `ps`: This provides detailed information for the given 'scope' and 'dataset'
combination.
- `pull`: This allows you to download all files belonging to the 'scope' and
'dataset' provided.
- `clear`: This removes all files belonging to the 'scope' and 'dataset', only
available for the local and canfar sites.
available for the local and canfar sites.
- `config`: Edit the `.datatrail/config.yaml` configuration file.
- `list`: This lists either the 'scopes' available or all of the datasets
belonging to the given scope.
- `ps`: This provides detailed information for the given 'scope' and 'dataset' combination.
- `pull`: This allows you to download all files belonging to the 'scope' and
'dataset' provided.
- `scout`: This command compares the number of files the Datatrail database
expects for a given dataset at each storage element with the number
actually observed there. If Minoc holds more files than the database
expects, the user can choose to create the replicas missing for Minoc.
- `version`: List the CLI and server version.

Detailed information on all of the CLI commands can be found on the
108 changes: 108 additions & 0 deletions docs/scout.md
@@ -0,0 +1,108 @@
# 🕵️ Investigating a dataset with scout

<!-- termynal -->
```bash
❯ datatrail scout --help
Usage: datatrail scout [OPTIONS] DATASET [SCOPES]...

Scout a dataset.

Options:
-v, --verbose Verbosity: v=INFO, vv=DEBUG.
-q, --quiet Set log level to ERROR.
--help Show this message and exit.
```

## Overview

The purpose of this command is to give users easy visibility into the
current state of a given dataset across all of Datatrail's storage
elements. Results are shown for every scope under which the given dataset
name is registered. To narrow the output, pass a list of scopes to the
command.

## Usage

Below is an example of the output for the dataset named `382085503`, both
filtered to only show information for the `chime.event.baseband.raw` scope
and unfiltered.

!!! note
    The output below does not reproduce the terminal colouring. The rows of
    the table are colour-coded to indicate whether they show observed or
    expected values: observed values are displayed in blue and expected
    values in yellow.

=== "Filtering by scope"

```bash
❯ datatrail scout 382085503 chime.event.baseband.raw
Scout Results for 382085503
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━━━┓
┃ Scope ┃ chime ┃ baseband_buffer ┃ kko ┃ gbo ┃ hco ┃ minoc ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━━━┩
│ chime.event.baseband.raw │ 824 │ 0 │ 0 │ -1 │ -1 │ 824 │ # (1)!
│ chime.event.baseband.raw │ 824 │ 0 │ 0 │ 0 │ 0 │ 824 │ # (2)!
└──────────────────────────┴───────┴─────────────────┴─────┴─────┴─────┴───────┘
Legend: Observed, Expected
NOTE: In the case where more files are expected at a site other than minoc, that
this may be due to the file type filtering when querying each site. This is a
known limitation of the current implementation.
```

1. The Observed number of files.
2. The Expected number of files.

=== "Unfiltered"

```bash
❯ datatrail scout 382085503
Scout Results for 382085503
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━┳━━━━━━┳━━━━━┳━━━━━━━┓
┃ Scope ┃ chime ┃ baseband_buffer ┃ kko ┃ gbo ┃ hco ┃ minoc ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━╇━━━━━━╇━━━━━╇━━━━━━━┩
│ chime.event.baseband.raw │ 824 │ 0 │ 0 │ -1 │ -1 │ 824 │
│ chime.event.baseband.raw │ 824 │ 0 │ 0 │ 0 │ 0 │ 824 │
├───────────────────────────┼───────┼─────────────────┼─────┼──────┼─────┼───────┤
│ chime.event.intensity.raw │ 162 │ 0 │ 0 │ -1 │ -1 │ 164 │
│ chime.event.intensity.raw │ 164 │ 0 │ 0 │ 0 │ 0 │ 164 │
├───────────────────────────┼───────┼─────────────────┼─────┼──────┼─────┼───────┤
│ gbo.event.baseband.raw │ 0 │ 0 │ 0 │ -1 │ -1 │ 1024 │
│ gbo.event.baseband.raw │ 0 │ 0 │ 0 │ 1024 │ 0 │ 1024 │
└───────────────────────────┴───────┴─────────────────┴─────┴──────┴─────┴───────┘
Legend: Observed, Expected
NOTE: In the case where more files are expected at a site other than minoc, that this may
be due to the file type filtering when querying each site. This is a known limitation of
the current implementation.
```

!!! failure "Negative files"
    If the server encounters an error while querying a storage element, the
    count for that element is shown as a negative number. This can occur when
    communicating with the mini-servers running at each storage element.
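
A minimal sketch of how a script consuming scout-style counts might separate failed queries from genuine counts, assuming (as in the tables above) that a negative observed value marks an error. The dictionary layout and the helper name `split_observed` are hypothetical and not part of the Datatrail CLI.

```python
from typing import Dict, List, Tuple


def split_observed(observed: Dict[str, int]) -> Tuple[Dict[str, int], List[str]]:
    """Separate usable file counts from storage elements that reported an error."""
    # Assumption: any negative value means the query to that site failed.
    counts = {site: n for site, n in observed.items() if n >= 0}
    failed = [site for site, n in observed.items() if n < 0]
    return counts, failed


# Observed row for chime.event.baseband.raw, copied from the table above.
observed = {
    "chime": 824,
    "baseband_buffer": 0,
    "kko": 0,
    "gbo": -1,
    "hco": -1,
    "minoc": 824,
}
counts, failed = split_observed(observed)
print(failed)  # ['gbo', 'hco']
```

Keeping the failed sites in a separate list avoids treating `-1` as a file count when totals are compared.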

### Healing at Minoc
In some cases, the number of files expected at minoc may be less than the number
that actually exist there. This can occur when API requests drop, leaving the
database in an inconsistent state. When `scout` detects this, it offers to
remedy the situation by adding the missing replicas.

```bash
❯ datatrail scout 383577603 chime.event.baseband.raw
Scout Results for 383577603
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━━━┓
┃ Scope ┃ chime ┃ baseband_buffer ┃ kko ┃ gbo ┃ hco ┃ minoc ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━━━┩
│ chime.event.baseband.raw │ 702 │ 0 │ 0 │ -1 │ -1 │ 702 │
│ chime.event.baseband.raw │ 702 │ 0 │ 0 │ 0 │ 0 │ 699 │
└──────────────────────────┴───────┴─────────────────┴─────┴─────┴─────┴───────┘
Legend: Observed, Expected
NOTE: In the case where more files are expected at a site other than minoc, that this may
be due to the file type filtering when querying each site. This is a known limitation of
the current implementation.

Scopes with minoc discrepancy:
- chime.event.baseband.raw

Would you like to attempt to heal this discrepancy? [y/n]: y
chime.event.baseband.raw - Healing successful.
```
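
The check behind this prompt can be sketched as follows. The comparison mirrors the condition added to `dtcli/scout.py` in this commit (observed count at minoc greater than the expected count). The `data` layout is inferred from that code, and the helper name `scopes_to_heal` is hypothetical.

```python
from typing import Dict, List


def scopes_to_heal(data: Dict[str, dict]) -> List[str]:
    """Return scopes where minoc holds more files than the database expects."""
    return [
        scope
        for scope, counts in data.items()
        if counts["observed"]["minoc"] > counts["expected"]["minoc"]
    ]


# Counts taken from the example output above for dataset 383577603.
data = {
    "chime.event.baseband.raw": {
        "observed": {"chime": 702, "minoc": 702},
        "expected": {"chime": 702, "minoc": 699},
    }
}
print(scopes_to_heal(data))  # ['chime.event.baseband.raw']
```

Provided expected counts are never negative, a failed minoc query (reported as a negative observed value) cannot satisfy this comparison, so it would not trigger the healing prompt.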
8 changes: 6 additions & 2 deletions docs/user_guide.md
@@ -12,6 +12,10 @@ How to install Datatrail CLI.

Performing the initial setup in order to use the Datatrail CLI.

## 🗑️ [clear](clear.md)

Deleting a dataset.

## 🗒️ [list](list.md)

Searching the Datatrail database for scopes and datasets.
@@ -24,6 +28,6 @@ Querying the Datatrail database for details about a dataset.

Downloading a dataset.

## 🗑️ [clear](clear.md)
## 🕵️ [scout](scout.md)

Deleting a dataset.
Investigating the number of files for a dataset across storage elements.
26 changes: 26 additions & 0 deletions dtcli/scout.py
@@ -7,6 +7,7 @@
import requests
from cadcutils.exceptions import BadRequestException
from rich.console import Console
from rich.prompt import Confirm
from rich.table import Table

from dtcli.config import procure
@@ -101,6 +102,7 @@ def scout( # noqa: C901
        error_console.print(data["error"])
        return None

    scopes_with_minoc_discrepancy: List[str] = []
    for scope in data.keys():
        basepath = data.get(scope).get("basepath")
        query = f"select count(*) from inventory.Artifact where uri like 'cadc:CHIMEFRB/{basepath}%'"  # noqa: E501
@@ -126,8 +128,31 @@ def scout( # noqa: C901
        for key in keys_missing_in_expected:
            data[scope]["expected"][key] = 0

        if data[scope]["observed"]["minoc"] > data[scope]["expected"]["minoc"]:
            scopes_with_minoc_discrepancy.append(scope)

    show_scout_results(dataset, data)

    if scopes_with_minoc_discrepancy:
        error_console.print("Scopes with minoc discrepancy:")
        for scope in scopes_with_minoc_discrepancy:
            error_console.print(f" - {scope}")
        ifHeal = Confirm.ask("\nWould you like to attempt to heal this discrepancy?")
        if ifHeal:
            basepath = data.get(scope).get("basepath")
            minoc_md5s = cadcclient.dataset_md5s(basepath)
            # console.print(minoc_md5s)
            url = (
                server
                + "/commit/dataset/scout/sync"
                + f"?name={dataset}&scope={scope}&replicate_to=minoc"
            )
            response = requests.post(url, json=minoc_md5s)
            if response.status_code == 200:
                console.print(f"{scope} - Healing successful.")
            else:
                error_console.print(f"{scope} - Healing failed.")


def show_scout_results(dataset: str, data: dict):
"""Create and display a table with scout results.
@@ -168,3 +193,4 @@ def show_scout_results(dataset: str, data: dict):
minoc, that this may be due to the file type filtering when querying each site. This \
is a known limitation of the current implementation.",
)
console.print()
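
In the healing branch of `scout` above, the request is built from whatever value `scope` last held. If the prompt runs once after all discrepant scopes have been listed, only the last listed scope would be healed. Below is a minimal sketch of one way to build a sync request per discrepant scope, assuming the endpoint path and query parameters shown in the diff. The helper `heal_urls` and the example server address are hypothetical, and `urlencode` is swapped in for the f-string so that names are escaped.

```python
from typing import List
from urllib.parse import urlencode


def heal_urls(server: str, dataset: str, scopes: List[str]) -> List[str]:
    """Build one sync URL per scope with a minoc discrepancy."""
    urls = []
    for scope in scopes:
        # Same parameters as the f-string in the diff, but percent-encoded.
        query = urlencode({"name": dataset, "scope": scope, "replicate_to": "minoc"})
        urls.append(f"{server}/commit/dataset/scout/sync?{query}")
    return urls


# Hypothetical server address, for illustration only.
urls = heal_urls(
    "http://localhost:8000",
    "383577603",
    ["chime.event.baseband.raw", "chime.event.intensity.raw"],
)
for url in urls:
    print(url)
```

Each URL would then be posted together with the MD5 payload for that scope's own `basepath`, so the lookup of `basepath` would also need to move inside the per-scope loop.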
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -81,10 +81,11 @@ nav:
- Install: install.md
- Initialise: initialising.md
- Commands:
- clear: clear.md
- list: list.md
- ps: ps.md
- pull: pull.md
- clear: clear.md
- scout: scout.md
- Command Line Interface:
- Commands: commands.md
- Reference: cli.md

0 comments on commit 73171dc
