Short Description

How to find groups and other structure in unlabeled, possibly high-dimensional data. Dimension reduction for visualization and data analysis. Clustering, association rules, and model fitting via the EM algorithm.

Learning Outcomes

By the end of the course, students are expected to be able to:

  1. Explain, with examples, the key differences between a supervised and an unsupervised learning problem.
  2. Successfully apply K-means, K-medoids, hierarchical, and model-based clustering, including the EM algorithm.
  3. Explain and appropriately apply the following dimension-reduction methods: principal components, factor analysis, and multidimensional scaling. Explain their differences and similarities.
  4. Explain and appropriately apply different matrix decompositions, including the Singular Value, Cholesky, QR, and LU decompositions.
  5. Apply relevant visualization tools (e.g., heatmaps and dendrograms) to the analysis and interpret them correctly.

Instructor (2016-2017)

Note: information on this page is preliminary and subject to change.