DSCI 573

Short Description

How to evaluate and select features and models. Cross-validation, ROC curves, feature engineering, and the role of regularization. Automating these tasks with hyperparameter optimization.

Learning Outcomes

By the end of the course, students are expected to be able to:

Explain how generalization error and overfitting to the training data affect the performance of a classification or regression model.
Apply shrinkage and feature selection methods (e.g., Lasso, elastic net).
Perform k-fold cross-validation and bootstrapping on the training data.
Evaluate the quality of a statistical model in order to do model/feature selection.
Explain how the ROC curve is generated, and how the area under the ROC curve (AUC) can be used to compare models.
Diagnose, understand, and address overfitting and underfitting.
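
As a rough illustration of two of the outcomes above (k-fold cross-validation and ROC/AUC), the following is a minimal sketch in plain Python. It is not taken from the course materials: the function names (k_fold_indices, train_validation_splits, roc_points, auc) and the toy labels and scores are hypothetical, chosen only to show the mechanics.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled folds of nearly equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_validation_splits(n_samples, k, seed=0):
    """Yield (train, validation) index lists, one pair per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, validation in enumerate(folds):
        # Training set is the union of all folds except fold i.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold.

    labels are 0/1 (1 = positive); a higher score means "more positive".
    Assumes at least one positive and one negative label.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Examples with tied scores cross the threshold together.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    print("AUC:", auc(roc_points(labels, scores)))
    for train, validation in train_validation_splits(n_samples=6, k=3):
        print("train:", sorted(train), "validation:", sorted(validation))
```

On the toy data, eight of the nine positive/negative pairs are ranked correctly, so the AUC is 8/9 (about 0.889). In practice a library such as scikit-learn would typically be used for both tasks; the hand-written version is only meant to make the steps visible.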

Instructor (2016-2017)

Mark Schmidt

Note: information on this page is preliminary and subject to change.