Short Description

How to evaluate and select features and models: cross-validation, ROC curves, feature engineering, and the role of regularization. Automating these tasks with hyperparameter optimization.

Learning Outcomes

By the end of the course, students are expected to be able to:

  1. Explain how generalization error and overfitting of the training data determine the performance of a classification or regression model.
  2. Apply shrinkage and feature selection methods (e.g., the lasso, elastic nets).
  3. Perform k-fold cross-validation and bootstrapping on the training data.
  4. Evaluate the quality of a statistical model in order to perform model/feature selection.
  5. Explain how an ROC curve is generated, and how the area under the ROC curve (AUC) can be used to compare models.
  6. Diagnose and address both overfitting and underfitting.
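Two of these outcomes, k-fold cross-validation splits and the AUC, can be sketched in a few lines of plain Python. This is a minimal illustration, not course material: the helper names `k_fold_indices` and `roc_auc` are ours, and the AUC is computed via its rank interpretation (the probability that a random positive example is scored above a random negative one, ties counting one half).

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k roughly equal, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def roc_auc(scores, labels):
    """AUC via the rank (Mann-Whitney) formula: fraction of positive/negative
    pairs where the positive example gets the higher score (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy use: every index lands in exactly one of the k held-out folds.
folds = k_fold_indices(10, k=5)
assert sorted(i for f in folds for i in f) == list(range(10))

# A model that ranks every positive above every negative has AUC 1.0.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # → 1.0
```

In practice one would use library routines instead (e.g., scikit-learn's `KFold` and `roc_auc_score`), but the rank formula above is the standard definition of the AUC being estimated.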

Instructor (2016-2017)

Note: information on this page is preliminary and subject to change.