Merge pull request #119 from MetOffice/develop
Merge all changes from develop onto main before merging CORDEX changes
Showing 19 changed files with 889 additions and 99 deletions.
Deleted file (contents not shown).
New file: Dockerfile (23 additions)

```dockerfile
FROM continuumio/miniconda3

RUN apt-get update

# Set working directory for the project
WORKDIR /app

SHELL ["/bin/bash", "--login", "-c"]

RUN apt-get install -y git

# Create Conda environment from the YAML file
COPY environment.yml .
RUN pip install --upgrade pip

RUN conda env create -f environment.yml

RUN conda init bash
RUN conda activate pyprecis-environment

RUN pip install ipykernel && \
    python -m ipykernel install --name pyprecis-training
```
Modified file: environment.yml (@@ -1,11 +1,17 @@)

```diff
 name: pyprecis-environment
 channels:
   - conda-forge
   - defaults
 dependencies:
-  - python=3.6.6
-  - numpy
-  - matplotlib
-  - cartopy=0.16.0
-  - dask=0.19.4
-  - iris=2.2.0
+  - python=3.6.10
+  - iris=2.4.0
+  - numpy=1.17.4
+  - matplotlib=3.1.3
+  - nc-time-axis=1.2.0
+  - jupyter_client=6.1.7
+  - jupyter_core=4.6.3
+  - dask=2.11.0
+  - notebook=5.7.8
+  - mo_pack=0.2.0
+  - boto3
+  - botocore
+  - tqdm
```
New file: AWS setup instructions (Markdown, 129 additions)
## AWS

### Create an EC2 instance

* Select the eu-west-2 (London) region from the top right of the navigation bar
* Click on Launch instance
* Choose the Amazon Linux 2 AMI (HVM), Kernel 5.10, 64-bit (x86) machine and click Select
* Choose t2.2xlarge and click Next: Configure Instance Details
* Choose the default subnet eu-west-2c
* In IAM role, choose the existing trainings-ec2-dev role and click Next: Add Storage
* 8 GB is fine; click Next: Add Tags
* Add the following tags:
  * Name: [unique instance name]
  * Tenable: FA
  * ServiceOwner: [firstname.lastname]
  * ServiceCode: PABCLT
* Add a security group: select the existing security group IAStrainings-ec2-mo
* Click Review and Launch, then select Launch
* It will prompt you to set a key pair (to allow SSH); create a new key pair and download it

This creates the instance. To see it, go to Instances; the instance state should be "Running". A boto3 sketch of the same launch is shown below.
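For reference, the same launch can be scripted with boto3. This is a minimal, hedged sketch rather than part of the original notes: the AMI ID, key-pair name, and tag values are placeholders to substitute with your own.

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-2")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",  # placeholder: Amazon Linux 2 AMI ID for eu-west-2
    InstanceType="t2.2xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="your_key",  # placeholder: the key pair created at launch
    IamInstanceProfile={"Name": "trainings-ec2-dev"},
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [
            {"Key": "Name", "Value": "unique-instance-name"},
            {"Key": "Tenable", "Value": "FA"},
            {"Key": "ServiceOwner", "Value": "firstname.lastname"},
            {"Key": "ServiceCode", "Value": "PABCLT"},
        ],
    }],
)
print(response["Instances"][0]["InstanceId"])
```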
### SSH instance on VDI

* Save the key (.pem) to ~/.ssh and set the permissions: chmod 0400 ~/.ssh/your_key.pem
* Open ~/.ssh/config and add the following:

```
Host ec2-*.eu-west-2.compute.amazonaws.com
    IdentityFile ~/.ssh/your_key.pem
    User ec2-user
```

* Find the public IPv4 DNS and SSH in using it: ssh ec2-<ip address>.eu-west-2.compute.amazonaws.com. The public IPv4 DNS can be found in the instance details on AWS: click on your instance and the details will open.

* Remember to shut down the instance when you are not using it; this saves cost (a programmatic sketch follows).
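Stopping the instance can also be scripted; a minimal sketch, assuming you know the instance ID (the ID below is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-2")
# Placeholder instance ID; take it from the EC2 console or the run_instances response.
ec2.stop_instances(InstanceIds=["i-0123456789abcdef0"])
```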
### Create an S3 bucket

* Go to the S3 service and press "Create bucket"
* Name the bucket
* Set the region to EU (London) eu-west-2
* Add tags:
  * Name: [name of bucket or any unique name]
  * ServiceOwner: [your-name]
  * ServiceCode: PABCLT
  * Tenable: FA
* Click "Create bucket"

The same bucket can be created programmatically; see the sketch below.
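A hedged boto3 equivalent of the console steps above; the bucket name is a placeholder (bucket names must be globally unique):

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-2")
bucket = "my-unique-training-bucket"  # placeholder name

# Create the bucket in eu-west-2 and apply the tags listed above.
s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "eu-west-2"},
)
s3.put_bucket_tagging(
    Bucket=bucket,
    Tagging={"TagSet": [
        {"Key": "Name", "Value": bucket},
        {"Key": "ServiceOwner", "Value": "your-name"},
        {"Key": "ServiceCode", "Value": "PABCLT"},
        {"Key": "Tenable", "Value": "FA"},
    ]},
)
```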
### Key configurations

The AWS scripts run only when the config files contain current keys. To update the keys:

* Go to AB climate training dev --> Administrator access --> command line or programmatic access
* Copy the keys under "Option 1: Set AWS environment variables"
* In VDI, paste these keys into ~/.aws/config, replacing any existing ones
  * Add [default] on the first line
* Copy the keys under "Option 2: Add a profile to your AWS credentials file"
* In VDI, paste the keys into the credentials file ~/.aws/credentials (remove the first copied line, which looks something like [198477955030_AdministratorAccess])
  * Add [default] on the first line

The config and credentials files should look like this (with your own keys):

```
[default]
export AWS_ACCESS_KEY_ID="ASIAS4NRVH7LD2RRGSFB"
export AWS_SECRET_ACCESS_KEY="rpI/dxzQWhCul8ZHd18n1VW1FWjc0LxoKeGO50oM"
export AWS_SESSION_TOKEN="IQoJb3JpZ2luX2VjEGkaCWV1LXdlc3QtMiJH"
```
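To confirm the refreshed keys are picked up, one quick check (not part of the original notes) is an STS identity call:

```python
import boto3

# Prints the account and role ARN that the current credentials resolve to.
print(boto3.client("sts").get_caller_identity())
```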
### Loading data onto an S3 bucket from VDI (using boto3)

To upload file(s) to S3, use: /aws-scripts/s3_file_upload.py
To upload directory(s) to S3, use: /aws-scripts/s3_bulk_data_upload.py
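Those scripts are not shown in this diff; the core of a single-file upload looks roughly like the sketch below, assuming an existing bucket (the paths and key are placeholders; the ias-pyprecis bucket name is taken from the download script further down):

```python
import boto3

s3 = boto3.client("s3")
# Placeholder local path and destination key.
s3.upload_file(
    Filename="/path/to/local/file.nc",
    Bucket="ias-pyprecis",
    Key="data/cmip5/file.nc",
)
```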
### AWS Elastic Container Registry

The following instructions are for creating an image repository on ECR and uploading a container image.

* SSH to the previously created EC2 instance and make an empty Git repo:

```
sudo yum install -y git
git init
```

* On VDI, run the following command to push the PyPrecis repo containing the Dockerfile to the EC2 instance:

```
git push <ec2 host name>:~
```

* Now check out the branch on EC2: git checkout [branch-name]
* Install Docker and start the Docker service:

```
sudo amazon-linux-extras install docker
sudo service docker start
```

* Build the Docker image:

```
sudo docker build .
```

* Go to the AWS ECR console and "Create repository"; make it private and name it (an API sketch follows this list)
* Once it is created, press "push commands"
* Copy the commands and run them on the EC2 instance; this pushes the container image to the registry. If you get a "permission denied" error, add "sudo" before "docker" in the command.
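Creating the repository can also be done through the API; a minimal hedged sketch (the repository name is a placeholder):

```python
import boto3

ecr = boto3.client("ecr", region_name="eu-west-2")
# Placeholder repository name.
response = ecr.create_repository(repositoryName="pyprecis-training")
print(response["repository"]["repositoryUri"])  # the URI to tag and push the image to
```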
### AWS SageMaker: run a notebook using a custom kernel

The instructions below follow this tutorial:
https://aws.amazon.com/blogs/machine-learning/bringing-your-own-custom-container-image-to-amazon-sagemaker-studio-notebooks/

* Go to SageMaker and "Open SageMaker domain"
* Add a user
* Name it and select AmazonSageMaker-ExecutionRole (the default one)
* Once the user is created, go to "Attach image"
* Select "New image" and add the image URI (copy it from the image repo)
* Give the new image a name and display name, select the SageMaker execution role, add tags, and attach the image
* Add a kernel name and display name (both can be the same)
* Now launch App -> Studio; it will open the notebook dashboard
* Select a Python notebook and add your custom-named kernel
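The console steps above can be approximated with the SageMaker API as well; a rough, hedged sketch (role ARN, image URI, and all names are placeholders):

```python
import boto3

sm = boto3.client("sagemaker", region_name="eu-west-2")

role_arn = "arn:aws:iam::123456789012:role/AmazonSageMaker-ExecutionRole"  # placeholder
image_uri = "123456789012.dkr.ecr.eu-west-2.amazonaws.com/pyprecis-training:latest"  # placeholder

# Register the container image and a version pointing at the ECR URI.
sm.create_image(ImageName="pyprecis-image", RoleArn=role_arn)
sm.create_image_version(ImageName="pyprecis-image", BaseImage=image_uri)

# Describe the kernel that Studio notebooks will offer.
sm.create_app_image_config(
    AppImageConfigName="pyprecis-image-config",
    KernelGatewayImageConfig={
        "KernelSpecs": [
            {"Name": "pyprecis-training", "DisplayName": "pyprecis-training"},
        ],
    },
)
```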
New file: S3 data-copy helper script (Python, 111 additions)
```python
import io
import os
import boto3
from urllib.parse import urlparse
from fnmatch import fnmatch
from shutil import copyfile


def _fetch_s3_file(s3_uri, save_to):
    """Download a single S3 object to a local path."""
    bucket_name, key = _split_s3_uri(s3_uri)
    print(f"Fetching s3 object {key} from bucket {bucket_name}")

    client = boto3.client("s3")
    obj = client.get_object(
        Bucket=bucket_name,
        Key=key,
    )
    with io.FileIO(save_to, "w") as f:
        for i in obj["Body"]:
            f.write(i)


def _save_s3_file(s3_uri, out_filename, file_to_save="/tmp/tmp"):
    """Upload a local file to the folder given by an S3 URI."""
    bucket, folder = _split_s3_uri(s3_uri)
    out_filepath = os.path.join(folder, out_filename)
    print(f"Save s3 object {out_filepath} to bucket {bucket}")
    client = boto3.client("s3")
    client.upload_file(
        Filename=file_to_save,
        Bucket=bucket,
        Key=out_filepath
    )


def _split_s3_uri(s3_uri):
    """Split an s3:// URI into (bucket, key) parts."""
    parsed_uri = urlparse(s3_uri)
    return parsed_uri.netloc, parsed_uri.path[1:]


def find_matching_s3_keys(in_fileglob):
    """Return the S3 keys in a bucket that match a glob pattern."""
    bucket_name, file_and_folder_name = _split_s3_uri(in_fileglob)
    folder_name = os.path.split(file_and_folder_name)[0]
    all_key_responses = _get_all_files_in_s3_folder(bucket_name, folder_name)
    matching_keys = []
    for key in [k["Key"] for k in all_key_responses]:
        if fnmatch(key, file_and_folder_name):
            matching_keys.append(key)
    return matching_keys


def _get_all_files_in_s3_folder(bucket_name, folder_name):
    """List every object under a prefix, following pagination."""
    client = boto3.client("s3")
    response = client.list_objects_v2(
        Bucket=bucket_name,
        Prefix=folder_name,
    )
    all_key_responses = []
    if "Contents" in response:
        all_key_responses = response["Contents"]
    while response["IsTruncated"]:
        continuation_token = response["NextContinuationToken"]
        response = client.list_objects_v2(
            Bucket=bucket_name,
            Prefix=folder_name,
            ContinuationToken=continuation_token,
        )
        if "Contents" in response:
            all_key_responses += response["Contents"]
    return all_key_responses


def copy_s3_files(in_fileglob, out_folder):
    '''
    Copy files from an S3 bucket to a local directory or another bucket.

    args
    ---
    in_fileglob: S3 URI of files (wildcards can be used)
    out_folder: local path or S3 URI where the data will be stored
    '''
    matching_keys = find_matching_s3_keys(in_fileglob)
    in_bucket_name = _split_s3_uri(in_fileglob)[0]
    out_scheme = urlparse(out_folder).scheme
    for key in matching_keys:
        new_filename = os.path.split(key)[1]
        temp_filename = os.path.join("/tmp", new_filename)
        in_s3_uri = os.path.join(f"s3://{in_bucket_name}", key)
        _fetch_s3_file(in_s3_uri, temp_filename)
        if out_scheme == "s3":
            _save_s3_file(
                out_folder,
                new_filename,
                temp_filename,
            )
        else:
            copyfile(
                temp_filename, os.path.join(out_folder, new_filename)
            )
        os.remove(temp_filename)


def main():
    in_fileglob = 's3://ias-pyprecis/data/cmip5/*.nc'
    out_folder = '/home/h01/zmaalick/myprojs/PyPRECIS/aws-scripts'
    copy_s3_files(in_fileglob, out_folder)


if __name__ == "__main__":
    main()
```