If you need help, please check the discussion board on Canvas! We check it regularly to answer common questions on projects and homework. The solution to your question might already be there!

Students are reminded to make use of office hours. Please reach out to any of the course staff whenever you need help; we can schedule an appointment to meet if required.

Getting Started with Python:

  • Codecademy Python
  • Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, Wes McKinney, O’Reilly Media, 2017. Code and notebooks, but not the text, are available on GitHub.
  • Data Science from Scratch: First Principles with Python, Joel Grus, O’Reilly Media, 2015. Code, but not the text, is available on GitHub.
  • Introduction to Machine Learning with Python: A Guide for Data Scientists, Andreas C. Müller and Sarah Guido, O’Reilly Media, 2016. Code, but not the text, is available on GitHub.

These books come from a more statistical background and mainly use R; however, they are considered to be some of the best texts for Statistics and Data Science. The first is an introduction; the second is appropriate for a graduate course.

  • An Introduction to Statistical Learning with Applications in R. Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. Springer. Available online for free at the authors' website.
  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani, and Jerome Friedman. Springer. Available online for free at the authors' website.

  • A good book on Data Mining in Java!

  • Another good book on probability for Data Science.

Some fun readings about Data Science and some of its key figures.

A large debt for this course is owed to John P. Dickerson at UMD and his course (http://jpdickerson.com/).