Dask-ML Workshop

Introduces participants to Dask-ML, a library for scaling standard Python machine learning tools (e.g., Scikit-Learn, XGBoost). Participants apply pre-built models to moderate-to-large datasets to learn best practices for parallel and out-of-core machine learning.

3 hours of instruction

PREREQUISITES

Participants should have prior experience using the Python language and, in particular, using standard Python tools for data analysis (notably NumPy, Pandas, Jupyter). Participants should also have some prior exposure to Scikit-Learn for machine learning and to Dask for scaling data analysis in Python.

LEARNING OBJECTIVES

  1. Deploy incremental learning with partial-fit models for large datasets
  2. Exploit parallelism for cross-validation and hyperparameter grid search using standard Python idioms
  3. Implement parallel prediction using Dask-ML meta-estimators
  4. Scale out linear models (e.g., linear and logistic regression) and XGBoost with Dask-ML
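
The incremental-learning pattern named in the first objective can be sketched in plain Python. This is only an illustration under stated assumptions: the class `IncrementalCentroidClassifier` and the toy data are hypothetical and are not part of Dask-ML or Scikit-Learn. The point is the shape of the pattern, in which a model exposes a `partial_fit` method that updates its state from one chunk at a time, so the full dataset never has to fit in memory.

```python
from collections import defaultdict


class IncrementalCentroidClassifier:
    """Hypothetical toy model: nearest-centroid classifier trained chunk by chunk."""

    def __init__(self):
        self.counts = defaultdict(int)  # number of samples seen per label
        self.sums = {}                  # per-label running sum of feature vectors

    def partial_fit(self, X_chunk, y_chunk):
        # Update the running sums and counts using only this chunk.
        for x, label in zip(X_chunk, y_chunk):
            if label not in self.sums:
                self.sums[label] = [0.0] * len(x)
            self.counts[label] += 1
            self.sums[label] = [s + xi for s, xi in zip(self.sums[label], x)]
        return self

    def centroids(self):
        # Mean feature vector per label, derived from the accumulated state.
        return {
            label: [s / self.counts[label] for s in total]
            for label, total in self.sums.items()
        }

    def predict(self, X):
        cents = self.centroids()

        def nearest(x):
            # Label whose centroid has the smallest squared distance to x.
            return min(
                cents,
                key=lambda label: sum(
                    (xi - ci) ** 2 for xi, ci in zip(x, cents[label])
                ),
            )

        return [nearest(x) for x in X]


# Made-up data, split into chunks as if each chunk were read from disk in turn.
chunks = [
    ([[0.0, 0.0], [1.0, 0.0]], ["a", "a"]),
    ([[10.0, 10.0], [0.0, 1.0]], ["b", "a"]),
    ([[11.0, 9.0], [9.0, 11.0]], ["b", "b"]),
]

model = IncrementalCentroidClassifier()
for X_chunk, y_chunk in chunks:
    model.partial_fit(X_chunk, y_chunk)

predictions = model.predict([[0.5, 0.5], [9.5, 10.5]])
```

Scikit-Learn estimators such as `SGDClassifier` expose a `partial_fit` method of this kind, and Dask-ML provides an `Incremental` wrapper in `dask_ml.wrappers` that passes the blocks of a Dask array to such an estimator one after another.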
