Add fetch job and update stage_ic to work with fetched ICs #3141

Open
wants to merge 45 commits into
base: develop
45 commits
fdb996d
Stage_ic updates: GH2988
DavidGrumm-NOAA Nov 27, 2024
a1f03ec
Removed white space
DavidGrumm-NOAA Dec 10, 2024
17f7e6c
remove more whitespace
DavidGrumm-NOAA Dec 10, 2024
a9bd3e2
Incorporate fetch into GW
DavidGrumm-NOAA Dec 12, 2024
ba1a017
Merge branch 'develop' into stage_ic_2988
DavidGrumm-NOAA Dec 13, 2024
e624872
Fix whitespace
DavidGrumm-NOAA Dec 13, 2024
fa35c61
Reconcile divergent branches. Merge branch 'stage_ic_2988' of github.…
DavidGrumm-NOAA Dec 13, 2024
f21f96b
Move the fetch options to the run_options dict
DavidGrumm-NOAA Dec 16, 2024
20f2862
Additonal code to incorporate fetch, including modifying some of the …
DavidGrumm-NOAA Dec 18, 2024
5ca1531
Updated fetch code to use fetch directory FETCHDIR instead of ATARDIR…
DavidGrumm-NOAA Dec 18, 2024
09ce49c
Deleted ci/cases/pr/ATM_cold.yaml
DavidGrumm-NOAA Dec 19, 2024
c71b340
modifying stage_ic and fetch code to ensure that both function compat…
DavidGrumm-NOAA Dec 20, 2024
a6dade0
Moved setting of cycle_YMDH
DavidGrumm-NOAA Dec 23, 2024
a6a84d4
Updates for the fetch and yaml code, and keys.
DavidGrumm-NOAA Jan 4, 2025
8aead59
Corrected typo
DavidGrumm-NOAA Jan 7, 2025
ecc1b48
Add specification of fetch directory, corrrected name of yaml file.
DavidGrumm-NOAA Jan 10, 2025
dd645eb
Add DO_FETCH_HPSS setting to all non-hera files in workflow/hosts
DavidGrumm-NOAA Jan 10, 2025
4ca9d49
Change pass to raise an error
DavidGrumm-NOAA Jan 10, 2025
df88c29
whitespace repair
DavidGrumm-NOAA Jan 10, 2025
dd2b726
Merge branch 'develop' into stage_ic_2988
DavidGrumm-NOAA Jan 10, 2025
89c42ef
Add S2SW_cold fetch template
DavidHuber-NOAA Jan 13, 2025
d0e6f9b
Update ATM_cold.yaml.j2
DavidHuber-NOAA Jan 13, 2025
701093c
Add YMDH to fetch_yamls
DavidHuber-NOAA Jan 13, 2025
9433473
Add previous_cycle to fetch dict
DavidHuber-NOAA Jan 13, 2025
04f8da9
Check for missing files after untarring, add logging
DavidHuber-NOAA Jan 13, 2025
d1f2a00
Renamed fetch configs
DavidHuber-NOAA Jan 13, 2025
34ae4a8
Replace gefs config.fetch with link to gfs
DavidHuber-NOAA Jan 13, 2025
5c25f26
Update fetch directory
DavidHuber-NOAA Jan 13, 2025
35f2454
Restrict fetch cases to C48_S2SW and C48_ATM
DavidHuber-NOAA Jan 13, 2025
57d01ea
Merge pull request #1 from DavidHuber-NOAA/new_fetch_yaml
DavidGrumm-NOAA Jan 13, 2025
b929200
Cleanup of env files
DavidGrumm-NOAA Jan 17, 2025
8e3be40
Merge branch 'develop' into stage_ic_2988
DavidGrumm-NOAA Jan 17, 2025
23b2057
Renamed variable, removed unused code
DavidGrumm-NOAA Jan 17, 2025
bab00e5
need to pull changes from remote. Merge branch 'stage_ic_2988' of git…
DavidGrumm-NOAA Jan 21, 2025
ac86509
Merged in develop
DavidGrumm-NOAA Jan 23, 2025
60241bd
Merged in develop
DavidGrumm-NOAA Jan 23, 2025
00e3d62
Undo merge mangling
DavidGrumm-NOAA Jan 23, 2025
e7396bc
Adding higher resolutions back
DavidGrumm-NOAA Jan 23, 2025
eba7841
Removing higher resolutions (they will be added in a different PR)
DavidGrumm-NOAA Jan 23, 2025
8644449
Address additional reviewer comments
DavidGrumm-NOAA Jan 23, 2025
b7c919e
Remove some fetch options for now
DavidGrumm-NOAA Jan 23, 2025
aae3e6e
Address reviewer comments(.venv) [David.Grumm@hfe10 G_WF_2988]$ git a…
DavidGrumm-NOAA Jan 24, 2025
2eec8da
Remove extraneous new lines
DavidGrumm-NOAA Jan 24, 2025
939ce24
Merge branch 'develop' into stage_ic_2988
DavidHuber-NOAA Jan 24, 2025
7afe68c
Delete updated submodules to placate git pull. Merge branch 'stage_ic…
DavidGrumm-NOAA Jan 24, 2025
23 changes: 23 additions & 0 deletions jobs/JGLOBAL_FETCH
@@ -0,0 +1,23 @@
#! /usr/bin/env bash

source "${HOMEgfs}/ush/preamble.sh"
source "${HOMEgfs}/ush/jjob_header.sh" -e "fetch" -c "base fetch"

# Execute fetching
"${SCRgfs}/exglobal_fetch.py"
err=$?

###############################################################
# Check for errors and exit if any of the above failed
if [[ "${err}" -ne 0 ]]; then
    echo "FATAL ERROR: Unable to fetch ICs to ${ROTDIR}; ABORT!"
    exit "${err}"
fi

##########################################
# Remove the Temporary working directory
##########################################
# Use a brace group so that "exit 1" leaves the script, not just a subshell
cd "${DATAROOT}" || { echo "${DATAROOT} does not exist. ABORT!"; exit 1; }
[[ "${KEEPDATA}" == "NO" ]] && rm -rf "${DATA}"

exit 0
38 changes: 38 additions & 0 deletions parm/config/gefs/config.fetch
@@ -0,0 +1,38 @@
#! /usr/bin/env bash

########## config.fetch ##########

echo "BEGIN: config.fetch"

# Get task specific resources
source "${EXPDIR}/config.resources" fetch

export ICSDIR="@ICSDIR@" # User provided ICSDIR; blank if not provided
export BASE_IC="@BASE_IC@" # Platform home for staged ICs

export FETCH_YAML_TMPL="${PARMgfs}/stage/master_gefs.yaml.j2"
export fetch_yaml="stage_atm_cold.yaml"

# Set ICSDIR (if not defined)

if [[ -z "${ICSDIR}" ]] ; then

  ic_ver="20240610"

  if (( NMEM_ENS > 0 )) ; then
    ensic="${CASE_ENS}"
  fi

  if [[ "${DO_OCN:-NO}" == "YES" ]] ; then
    ocnic="mx${OCNRES}"
  fi

  export ICSDIR="${BASE_IC}/${CASE}${ensic:-}${ocnic:-}/${ic_ver}"

fi

# Use of perturbations files for ensembles
export USE_OCN_ENS_PERTURB_FILES="NO"
export USE_ATM_ENS_PERTURB_FILES="NO"

echo "END: config.fetch"
37 changes: 37 additions & 0 deletions parm/config/gfs/config.fetch
@@ -0,0 +1,37 @@
#! /usr/bin/env bash

########## config.fetch ##########

echo "BEGIN: config.fetch"

# Get task specific resources
source "${EXPDIR}/config.resources" fetch

export ICSDIR="@ICSDIR@" # User provided ICSDIR; blank if not provided
export BASE_IC="@BASE_IC@" # Platform home for staged ICs

export FETCH_YAML_TMPL="${PARMgfs}/stage/master_gfs.yaml.j2"
export fetch_yaml="stage_atm_cold.yaml"

# Set ICSDIR (if not defined)
if [[ -z "${ICSDIR}" ]] ; then

  ic_ver="20240610"

  if (( NMEM_ENS > 0 )) ; then
    ensic="${CASE_ENS}"
  fi

  if [[ "${DO_OCN:-NO}" == "YES" ]] ; then
    ocnic="mx${OCNRES}"
  fi

  export ICSDIR="${BASE_IC}/${CASE}${ensic:-}${ocnic:-}/${ic_ver}"

fi

# Use of perturbations files for ensembles
export USE_OCN_ENS_PERTURB_FILES="NO"
export USE_ATM_ENS_PERTURB_FILES="NO"

echo "END: config.fetch"
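
The ICSDIR fallback in config.fetch is plain string assembly, so it can be checked offline. Below is a minimal Python sketch of the same logic. The helper name `build_icsdir`, its argument names, and the sample path are assumptions for illustration and are not part of this PR.

```python
from typing import Optional


def build_icsdir(base_ic: str, case: str, ic_ver: str = "20240610",
                 nmem_ens: int = 0, case_ens: str = "",
                 do_ocn: str = "NO", ocnres: Optional[str] = None,
                 icsdir: str = "") -> str:
    """Mirror the ICSDIR fallback in config.fetch (hypothetical helper)."""
    # A user-provided ICSDIR always wins
    if icsdir:
        return icsdir
    # Ensemble resolution is appended only when ensemble members are requested
    ensic = case_ens if nmem_ens > 0 else ""
    # Ocean resolution is appended only for coupled runs
    ocnic = f"mx{ocnres}" if do_ocn == "YES" else ""
    return f"{base_ic}/{case}{ensic}{ocnic}/{ic_ver}"


# Illustrative values only; the base path is a placeholder
print(build_icsdir("/path/to/ICs", "C48", do_ocn="YES", ocnres="500"))
```

Keeping the ensemble and ocean suffixes as empty strings by default matches the `${ensic:-}` and `${ocnic:-}` expansions in the shell version.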
19 changes: 19 additions & 0 deletions parm/fetch/stage_atm_cold.yaml
@@ -0,0 +1,19 @@
untar:
    tarball: "{{ ATARDIR }}/{{ cycle_YMDH }}/atm_cold.tar"
    on_hpss: True
    contents:
        - gfs_ctrl.nc
        {% for ftype in ["gfs_data", "sfc_data"] %}
        {% for ntile in range(1, ntiles + 1) %}
        - {{ ftype }}.tile{{ ntile }}.nc
        {% endfor %} # ntile
        {% endfor %} # ftype
    destination: "{{ DATA }}"
atmosphere_cold:
    copy:
        - ["{{ DATA }}/gfs_ctrl.nc", "{{ COMOUT_ATMOS_INPUT }}"]
        {% for ftype in ["gfs_data", "sfc_data"] %}
        {% for ntile in range(1, ntiles + 1) %}
        - ["{{ DATA }}/{{ ftype }}.tile{{ ntile }}.nc", "{{ COMOUT_ATMOS_INPUT }}"]
        {% endfor %} # ntile
        {% endfor %} # ftype
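
For reference, the two nested loops in this template expand to one control file plus one gfs_data and one sfc_data file per tile. Below is a small Python sketch of that expansion. The helper name `expected_contents` is hypothetical, `ntiles = 6` is only an illustrative value, and Jinja itself is not invoked here.

```python
from typing import List


def expected_contents(ntiles: int) -> List[str]:
    """List the files the `contents` loops should render (hypothetical helper)."""
    files = ["gfs_ctrl.nc"]
    # Same loop order as the template: file type outermost, tile innermost
    for ftype in ("gfs_data", "sfc_data"):
        for ntile in range(1, ntiles + 1):
            files.append(f"{ftype}.tile{ntile}.nc")
    return files


contents = expected_contents(6)
print(len(contents))   # 1 control file + 2 file types * ntiles
print(contents[:3])
```

A list like this could also serve as the reference when checking that every requested file is present after extraction.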
45 changes: 45 additions & 0 deletions scripts/exglobal_fetch.py
@@ -0,0 +1,45 @@
#!/usr/bin/env python3

import os

from pygfs.task.fetch import Fetch
from wxflow import AttrDict, Logger, cast_strdict_as_dtypedict, logit

# initialize root logger
logger = Logger(level=os.environ.get("LOGGING_LEVEL", "DEBUG"), colored_log=True)


@logit(logger)
def main():

    config = cast_strdict_as_dtypedict(os.environ)

    # Instantiate the Fetch object
    fetch = Fetch(config)

    # Pull out all the configuration keys needed to run the fetch step
    keys = ['current_cycle', 'RUN', 'PDY', 'PARMgfs', 'PSLOT', 'ROTDIR', 'fetch_yaml', 'ATARDIR', 'ntiles']

    fetch_dict = AttrDict()
    for key in keys:
        fetch_dict[key] = fetch.task_config.get(key)
        if fetch_dict[key] is None:
            print(f"Warning: key ({key}) not found in task_config!")

    # Also import all COMOUT* directory and template variables
    for key in fetch.task_config.keys():
        if key.startswith("COMOUT_"):
            fetch_dict[key] = fetch.task_config.get(key)
            if fetch_dict[key] is None:
                print(f"Warning: key ({key}) not found in task_config!")

    # Determine which archives to retrieve from HPSS
    # Read the input YAML file to get the list of tarballs on tape
    atardir_set = fetch.configure(fetch_dict)

    # Pull the data from tape or locally and store it at the specified destination
    fetch.execute_pull_data(atardir_set)


if __name__ == '__main__':
    main()
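
The key-gathering step in this script can be exercised without wxflow. Below is a minimal sketch that uses plain dicts in place of `AttrDict` and `task_config`. The function name `collect_fetch_keys` and the sample values are assumptions for illustration, not code from the PR.

```python
from typing import Any, Dict, List, Tuple


def collect_fetch_keys(task_config: Dict[str, Any],
                       keys: List[str]) -> Tuple[Dict[str, Any], List[str]]:
    """Copy the requested keys plus every COMOUT_* key; report missing ones."""
    fetch_dict: Dict[str, Any] = {}
    missing: List[str] = []
    for key in keys:
        fetch_dict[key] = task_config.get(key)
        if fetch_dict[key] is None:
            missing.append(key)
    # COMOUT_* entries are taken wholesale, as in the script
    for key, value in task_config.items():
        if key.startswith("COMOUT_"):
            fetch_dict[key] = value
    return fetch_dict, missing


# Hypothetical sample configuration
sample = {"RUN": "gfs", "ntiles": 6, "COMOUT_ATMOS_INPUT": "/tmp/com/atmos/input"}
selected, missing = collect_fetch_keys(sample, ["RUN", "ntiles", "ATARDIR"])
print(sorted(selected))
print(missing)
```

Returning the missing keys, rather than only printing a warning, would let the caller decide whether an absent key such as ATARDIR should be fatal.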
98 changes: 98 additions & 0 deletions ush/python/pygfs/task/fetch.py
@@ -0,0 +1,98 @@
#!/usr/bin/env python3

import os
import tarfile
from logging import getLogger
from typing import Any, Dict, List

from wxflow import (AttrDict, FileHandler, Hsi, Task,
                    logit, parse_j2yaml)
from wxflow import htar as Htar


logger = getLogger(__name__.split('.')[-1])


class Fetch(Task):
    """Task to pull ROTDIR data from HPSS (or locally)
    """

    @logit(logger, name="Fetch")
    def __init__(self, config: Dict[str, Any]) -> None:
        """Constructor for the Fetch task
        The constructor is responsible for collecting necessary yamls based on
        the runtime options and RUN.

        Parameters
        ----------
        config : Dict[str, Any]
            Incoming configuration for the task from the environment

        Returns
        -------
        None
        """
        super().__init__(config)

        # Perhaps add other stuff to self.

    @logit(logger)
    def configure(self, fetch_dict: Dict[str, Any]):
        """Determine which tarballs will need to be extracted

        Parameters
        ----------
        fetch_dict : Dict[str, Any]
            Task specific keys, e.g. COM directories, etc

        Returns
        -------
        parsed_fetch : Dict[str, Any]
            Dictionary derived from the yaml file with necessary HPSS info.
        """

        self.hsi = Hsi()

        fetch_yaml = fetch_dict.fetch_yaml
        fetch_parm = os.path.join(fetch_dict.PARMgfs, "fetch")

        parsed_fetch = parse_j2yaml(os.path.join(fetch_parm, fetch_yaml),
                                    fetch_dict)

        return parsed_fetch

    @logit(logger)
    def execute_pull_data(self, atardir_set: Dict[str, Any]) -> None:
        """Pull data from HPSS based on a yaml dictionary and store at the
        specified destination.

        Parameters
        ----------
        atardir_set : Dict[str, Any]
            Dict defining set of tarballs to pull and where to put them.

        Returns
        -------
        None
        """
        # Read the YAML-derived settings before validating them
        f_names = atardir_set.untar.contents
        on_hpss = atardir_set.untar.on_hpss
        dest = atardir_set.untar.destination

        if len(f_names) <= 0:     # Abort if no files
            raise FileNotFoundError("FATAL ERROR: The tar ball has no files")  # DG - add name

        # Select the action depending on whether on_hpss is True, and pull these
        # data from tape or locally and place them where they need to go
        # DG - these need testing
        # NOTE: both branches currently create an archive at dest rather than
        # extracting from untar.tarball
        if on_hpss is True:  # htar all files in f_names
            htar_obj = Htar.Htar()
            htar_obj.cvf(dest, f_names)

        else:  # tar all files in f_names
            with tarfile.open(dest, "w") as tar:
                for filename in f_names:
                    tar.add(filename)


# Other helper methods...
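
Commit 04f8da9 mentions checking for missing files after untarring. Below is a self-contained sketch of what a local (non-HPSS) extraction with such a check could look like, using only the standard library `tarfile` module. The function name `extract_and_verify` and the throwaway test data are assumptions, and this is not the PR's implementation.

```python
import os
import tarfile
import tempfile
from typing import List


def extract_and_verify(tarball: str, contents: List[str], dest: str) -> List[str]:
    """Extract the named members into dest and return any that are missing."""
    with tarfile.open(tarball, "r") as tar:
        available = set(tar.getnames())
        for name in contents:
            if name in available:
                tar.extract(name, path=dest)
    # Anything requested but not on disk afterwards is reported to the caller
    return [name for name in contents
            if not os.path.isfile(os.path.join(dest, name))]


# Build a throwaway tarball holding one file, then request two
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "gfs_ctrl.nc")
    with open(src, "w") as handle:
        handle.write("placeholder")
    tarball = os.path.join(tmp, "atm_cold.tar")
    with tarfile.open(tarball, "w") as tar:
        tar.add(src, arcname="gfs_ctrl.nc")
    dest = os.path.join(tmp, "out")
    os.makedirs(dest)
    missing = extract_and_verify(tarball, ["gfs_ctrl.nc", "sfc_data.tile1.nc"], dest)
    print(missing)
```

In a task like `Fetch`, a non-empty return value would be a natural point to raise `FileNotFoundError` with the tarball name included in the message.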
