CSDMS 2019 Pangeo Clinic

Joe Hamman edited this page May 21, 2019 · 3 revisions

Welcome to the 2019 CSDMS Pangeo Clinic! Here's some useful information for our session:

  • Location: University of Colorado, Sustainability Energy and Environment Complex, 4001 Discovery Drive, C120 Boulder, Colorado 80309
  • Room: N124
  • Time: May 21, 2019, 1:30-3:30 p.m.
  • Clinic Leader: Joe Hamman, NCAR

Pangeo: Scalable Geoscience Tools in Python – Xarray, Dask, and Jupyter

Earth scientists face serious challenges when working with large datasets. Pangeo is a rapidly growing community initiative and open-source software ecosystem for scalable geoscience using Python. Three of Pangeo’s core packages are 1) Jupyter, a web-based tool for interactive computing, 2) Xarray, a data model and toolkit for working with N-dimensional labeled arrays, and 3) Dask, a flexible parallel computing library. When combined with distributed computing, these tools can help geoscientists perform interactive analysis on datasets up to petabytes in size. In this interactive tutorial, we will demonstrate how to employ this platform using real science examples from hydrology, remote sensing, and oceanography. Participants will follow along using Jupyter notebooks to interact with Xarray and Dask running on Google Cloud Platform.
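
For a feel of how Xarray and Dask fit together, here is a minimal sketch. It is not taken from the clinic notebooks: the data are synthetic random numbers, and the variable name `temperature`, the grid, and the chunk size are all assumptions chosen for illustration. It assumes `numpy`, `pandas`, `xarray`, and `dask` are installed.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily "temperature" field on a coarse 10-degree grid (illustrative only).
time = pd.date_range("2019-01-01", periods=365, freq="D")
lat = np.linspace(-90, 90, 19)
lon = np.linspace(0, 350, 36)
rng = np.random.default_rng(0)

temp = xr.DataArray(
    rng.normal(size=(time.size, lat.size, lon.size)),
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="temperature",
)

# Split the array into Dask chunks along time; operations on it become lazy.
chunked = temp.chunk({"time": 73})

# Label-based groupby: this builds a task graph but computes nothing yet.
monthly = chunked.groupby("time.month").mean("time")

# Trigger the (possibly parallel) computation and load the result into memory.
result = monthly.compute()
print(result.sizes)
```

The same code can scale from a laptop to a cluster because only the Dask scheduler behind `compute()` changes, while the Xarray expressions stay the same.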

Clinic Materials

Agenda

This is an approximate agenda; we may flex as appropriate.

  • Welcome and introductions: 10 minutes
  • Overview of Pangeo: 10 minutes
  • Xarray: 30 minutes
  • Dask: 30 minutes
  • Sample science applications: 30 minutes
  • Wrap up / questions: 10 minutes

Afterwards: Please fill out this survey.