3 hours of instruction
Introduces participants to the Xarray project for manipulating multi-dimensional labeled data (e.g., the gridded, multi-channel datasets common in the geosciences). Participants practice using Xarray for data analysis, extending techniques from Pandas & NumPy to high-dimensional labeled arrays.
PREREQUISITES
Participants should have a working knowledge of the Python language and, in particular, of standard Python tools for data analysis (notably NumPy, Pandas, and Jupyter). Prior experience with NetCDF, HDF, or related file formats for representing scientific datasets is useful but not required. Some familiarity with Dask is also helpful for a few of the examples.
LEARNING OBJECTIVES
- Select & apply appropriate Xarray structures (i.e., DataArrays, Datasets) for a given computational problem
- Persist or ingest Xarray core data structures using various standard file formats
- Explore & analyze high-dimensional Xarray labeled data with NumPy- or Pandas-style operations (e.g., groupby, indexing, selection, broadcasting)
- Design Dask-based pipelines with Xarray for out-of-core computation on large datasets
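As a rough illustration of the kind of work these objectives describe, the following minimal sketch builds a DataArray, wraps it in a Dataset, and applies label-based selection, groupby, and broadcasting. The temperature grid, coordinate values, and variable names are made up for illustration and are not taken from the course materials; file I/O and Dask are left out to keep the example self-contained.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical data: 60 days of temperature on a small 2 x 3 lat/lon grid.
times = pd.date_range("2024-01-01", periods=60, freq="D")
lats = [10.0, 20.0]
lons = [100.0, 110.0, 120.0]
rng = np.random.default_rng(0)
values = rng.normal(loc=15.0, scale=2.0, size=(len(times), len(lats), len(lons)))

# A DataArray is a single labeled N-dimensional array.
temperature = xr.DataArray(
    values,
    dims=("time", "lat", "lon"),
    coords={"time": times, "lat": lats, "lon": lons},
    name="temperature",
)

# A Dataset is a dict-like collection of DataArrays sharing coordinates.
ds = xr.Dataset({"temperature": temperature})

# Label-based selection (Pandas-style): by coordinate value, not position.
point = ds["temperature"].sel(lat=20.0, lon=110.0)
january = ds.sel(time=slice("2024-01-01", "2024-01-31"))

# Groupby on a datetime component, then reduce over time within each group.
monthly_mean = ds["temperature"].groupby("time.month").mean()

# Broadcasting by dimension name: the (lat, lon) time-mean is subtracted
# from every time step of the (time, lat, lon) array.
anomaly = temperature - temperature.mean("time")
```

Because operations align on dimension names rather than axis positions, the anomaly calculation needs no manual reshaping, which is the main convenience Xarray adds over plain NumPy arrays.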