## About this document

This document is a compilation of textbooks referenced in MDS courses. The original version was created by an MDS student, Talha Siddiqui.

Textbook | Author(s) | Year | Course Code | Course Name | Comments |
---|---|---|---|---|---|
A Course in Machine Learning | Hal Daumé III | 2017 | DSCI 571, 572, 573, 563, 575 | Supervised Learning I, Supervised Learning II, Feature and Model Selection, Unsupervised Learning, Advanced Machine Learning | |
Advanced R | Hadley Wickham | 2014 | DSCI 511 | Programming for Data Science | This is a prominent resource for R as a programming language, allowing the reader to dig deep into R. It expects readers to already have some programming background. Its first part, Foundations, is closely aligned with the objectives of DSCI 511 and is therefore the textbook for the second half of the course. Gaining familiarity with this book will likely be an asset in your data science career. |
Algorithm Design | Jon Kleinberg and Éva Tardos | 2005 | DSCI 512 | Algorithms and Data Structures | |
Algorithm Design: Foundations, Analysis, and Internet Examples | Michael Goodrich and Roberto Tamassia | 2001 | DSCI 512 | Algorithms and Data Structures | |
Algorithms | Sanjoy Dasgupta, Christos Papadimitriou and Umesh Vazirani | 2006 | DSCI 512 | Algorithms and Data Structures | |
An Introduction to Statistical Learning: with Applications in R | James, Gareth; Witten, Daniela; Hastie, Trevor; and Tibshirani, Robert | 2014 | DSCI 561, 563, 572, 573 | Regression I, Unsupervised Learning, Supervised Learning II, Feature and Model Selection | For 561: especially Chapter 3; a modern and approachable take on statistics and machine learning. For 573: Chapter 2 (Statistical Learning), Chapter 5 (Resampling Methods), Chapter 6 (Linear Model Selection and Regularization), and Chapter 7 (Moving Beyond Linearity). |
Art of Data Science | Roger Peng & Elizabeth Matsui | 2016 | DSCI 522 | Data Science Workflows | |
Artificial Intelligence: A Modern Approach, 3rd Edition | Stuart Russell and Peter Norvig | 2009 | DSCI 571, 572 | Supervised Learning I, Supervised Learning II | |
Artificial Intelligence: Foundations of Computational Agents, second edition | David Poole and Alan Mackworth | 2017 | DSCI 571, 572 | Supervised Learning I, Supervised Learning II | |
Bayesian Data Analysis | Andrew Gelman, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin | | DSCI 553 | Statistical Inference and Computation II | |
Data Analysis and Visualization Using R | David Robinson | 2014 | DSCI 511 | Programming for Data Science | |
Data Wrangling with Python: Tips and Tools to Make Your Life Easier | Jacqueline Kazil, Katharine Jarmul | 2016 | DSCI 523 | Data Wrangling | |
Database Management Systems, 3rd Edition | Ramakrishnan, Raghu and Gehrke, Johannes | 1996 | DSCI 513 | Databases and Data Retrieval | |
Deep Learning | Ian Goodfellow, Yoshua Bengio, and Aaron Courville | 2016 | DSCI 572 | Supervised Learning II | |
Deep Learning With Python | Jason Brownlee | | DSCI 572 | Supervised Learning II | |
Dive into Deep Learning | Aston Zhang, Zack C. Lipton, Mu Li, Alex J. Smola | | DSCI 572 | Supervised Learning II | |
Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models, Second Edition | Julian J. Faraway | 2016 | DSCI 562 | Regression II | |
ggplot2: Elegant Graphics for Data Analysis | Hadley Wickham | 2009 | DSCI 531 | Data Visualization I | Readable, comprehensive resource for learning about ggplot2, by the main author of the ggplot2 package, Hadley Wickham. |
Grokking Deep Learning | Andrew Trask | 2019 | DSCI 572 | Supervised Learning II | |
Hands-On Programming with R | Garrett Grolemund | 2014 | DSCI 511 | Programming for Data Science | |
Houston, We Have a Narrative: Why Science Needs Story | Randy Olson | 2015 | DSCI 542 | Communication and Argumentation | Writing & Speaking |
Information Theory, Inference, and Learning Algorithms | David J.C. MacKay | 2003 | DSCI 563 | Unsupervised Learning | Chapters 20-22 |
Introduction to Algorithms, 3rd edition | Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein | | DSCI 512 | Algorithms and Data Structures | |
Introduction to Data Mining | Pang-Ning Tan, Michael Steinbach, Vipin Kumar | 2005 | DSCI 572 | Supervised Learning II | |
Introduction to Empirical Bayes: Examples from Baseball Statistics | David Robinson | | DSCI 553 | Statistical Inference and Computation II | |
Introduction to Machine Learning with Python: A Guide for Data Scientists | Andreas C. Mueller and Sarah Guido | 2016 | DSCI 571 | Supervised Learning I | |
Introductory Time Series with R | Cowpertwait, P. and Metcalfe, A. | 2009 | DSCI 574 | Spatial and Temporal Models | A great hands-on approach to time series modelling |
Linear Models with R | Julian James Faraway | 2005 | DSCI 561 | Regression I | Comprehensive book on linear models. |
Machine Learning: A Probabilistic Perspective | Kevin Murphy | 2012 | DSCI 572 | Supervised Learning II | |
Mathematics for Machine Learning | Marc Peter Deisenroth, A Aldo Faisal, and Cheng Soon Ong | 2018 | DSCI 572 | Supervised Learning II | |
Mining of Massive Datasets 2nd Edition | Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman | 2014 | DSCI 572 | Supervised Learning II | |
Modern Dive: An Introduction to Statistical and Data Sciences | Chester Ismay and Albert Y. Kim | 2018 | DSCI 552 | Statistical Inference and Computation I | |
Neural Networks and Deep Learning | Michael A. Nielsen | 2018 | DSCI 572 | Supervised Learning II | |
OpenIntro Statistics | David M Diez, Christopher D Barr, Mine Çetinkaya-Rundel | 2010 | DSCI 552, 561 | Statistical Inference and Computation I, Regression I | Fairly accessible, seems to lean towards a traditional approach. Chapters 7 & 8 are relevant for linear regression |
Pattern Recognition and Machine Learning | Christopher Bishop | 2007 | DSCI 572 | Supervised Learning II | |
Probabilistic Programming and Bayesian Methods for Hackers | Cam Davidson-Pilon | | DSCI 553 | Statistical Inference and Computation II | |
Python Data Science Handbook | Jake VanderPlas | 2016 | DSCI 511 | Programming for Data Science | |
Python for Computational Science and Engineering | Hans Fangohr | 2016 | DSCI 511 | Programming for Data Science | |
Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython | Wes McKinney | 2011 | DSCI 511 | Programming for Data Science | |
R for Data Science (r4ds) | Garrett Grolemund & Hadley Wickham | 2016 | DSCI 531, 542, 561 | Data Visualization I, Communication and Argumentation, Regression I | For 531: overall a good book on using R for data science, including data vis, of course! For 542: Tools & Technology, Chapters 26-30. For 561: especially Part IV; a practical and approachable book on the use of R for data science. |
R Graphics Cookbook | Winston Chang | 2012 | DSCI 531 | Data Visualization I | Good as a reference if you want to learn how to make a specific type of plot in ggplot2. |
Spatio-Temporal Methods in Environmental Epidemiology | Shaddick, Gavin and Zidek, James V. | 2016 | DSCI 574 | Spatial and Temporal Models | A less detailed treatment of time series analysis as it is not a primary focus of the book. A good addition in terms of examples to the lecture notes. Chapters 10.3, 10.4, 10.6 |
Statistical Rethinking: A Bayesian Course with Examples in R and Stan (& PyMC3 & brms too) | Richard McElreath | | DSCI 553 | Statistical Inference and Computation II | |
Survival analysis: a self-learning text, 3rd edition | David G. Kleinbaum, Mitchel Klein | 2012 | DSCI 562 | Regression II | Non-technical explanation of survival analysis, with a nice succinct summary along the side of each page. Recommends epidemiological background, but we will avoid those parts. |
The Analysis of Time Series: An Introduction | Chatfield, Chris | 2003 | DSCI 574 | Spatial and Temporal Models | Chapters 1-5. A very readable introduction to time series analysis, without heavy mathematics. |
The Art of Computer Programming, Volume 1-4 | Donald E. Knuth | | DSCI 512 | Algorithms and Data Structures | |
The Elements of Statistical Learning. Second Edition | Hastie, T., Tibshirani, R. and Friedman, J. | 2009 | DSCI 563 | Unsupervised Learning | |
The Psychology of Persuasion | Robert Cialdini | 1984 | DSCI 542 | Communication and Argumentation | Persuasion: Influence |
The Sense of Style | Steven Pinker | 2014 | DSCI 542 | Communication and Argumentation | Writing |
Think Python: How to Think Like a Computer Scientist | Allen B. Downey | 2002 | DSCI 511 | Programming for Data Science | Standard textbook for introductory programming courses. It includes case studies and exercises. |
Thinking, Fast and Slow | Daniel Kahneman | 2011 | DSCI 542 | Communication and Argumentation | Heuristics & Biases |
Visualization Analysis and Design | Tamara Munzner | 2014 | DSCI 531, 532 | Data Visualization I, Data Visualization II | The go-to book for data vis theory |