How to evaluate and select features and models. Cross-validation, ROC curves, feature engineering, the role of regularization. Automating these tasks with hyperparameter optimization.
By the end of the course, students are expected to be able to:
- Explain how generalization error and overfitting to the training data affect the performance of a classification or regression model.
- Apply shrinkage and feature selection methods (e.g., Lasso, elastic net).
- Perform k-fold cross-validation and bootstrapping on the training data.
- Evaluate the quality of a statistical model in order to do model/feature selection.
- Explain how the ROC curve is generated, and how the area under the ROC curve (AUC) can be used to compare models.
- Diagnose, understand, and address overfitting and underfitting.
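
As an illustration of the ROC/AUC outcome above, the following is a minimal sketch of how an ROC curve can be traced by sweeping a decision threshold over classifier scores, with the AUC then approximated by the trapezoidal rule. It is not course material: the function names `roc_curve_points` and `auc_trapezoid` and the toy labels and scores are hypothetical, chosen only for this example, and the sketch assumes binary labels coded as 0 and 1.

```python
def roc_curve_points(labels, scores):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    Assumes binary labels (1 = positive, 0 = negative) and that a higher
    score means "more likely positive". Names are illustrative only.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold past all examples tied at this score at once.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Approximate the area under the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data, made up for this example.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_curve_points(labels, scores)
auc = auc_trapezoid(points)
print(f"AUC = {auc:.3f}")  # prints AUC = 0.889
```

Here 8 of the 9 positive/negative pairs are ranked correctly, which matches the AUC of 8/9. A model whose scores rank every positive above every negative would reach an AUC of 1.0, while uninformative scores give about 0.5, which is why AUC is a convenient single number for comparing models.
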
Note: information on this page is preliminary and subject to change.