If you need help, please check the discussion board on Canvas! We check it regularly to answer common questions on projects and homeworks. The solution to your question might already be there!
Students are reminded to make use of office hours. Please reach out to any of the course staff whenever you need to, and we can set up an appointment to meet if required.
Getting Started with Python:
- Codecademy Python
- Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. Wes McKinney. O’Reilly Media, 2017. Code and notebooks (but not the text) are available on GitHub.
- Data Science from Scratch: First Principles with Python. Joel Grus. O’Reilly Media, 2015. Code (but not the text) is available on GitHub.
- Introduction to Machine Learning with Python: A Guide for Data Scientists. Andreas C. Müller and Sarah Guido. O’Reilly Media, 2016. Code (but not the text) is available on GitHub.
These books come from a more statistical background and are mainly taught in R; however, they are considered to be some of the best texts for Statistics and Data Science. The first is an introduction; the second is appropriate for a graduate course.
- An Introduction to Statistical Learning with Applications in R. Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. Springer. Available online for free at the authors' website.
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani, and Jerome Friedman. Springer. Available online for free at the authors' website.
- A good book on Data Mining in Java!
- Another good book on probability for Data Science.
Some fun readings about Data Science and some of its key figures.
- John W. Tukey: His Life and Professional Contributions. David R. Brillinger. The Annals of Statistics Vol 30, No. 6, 2002.
- 50 Years of Data Science. David Donoho. Manuscript Based on Invited Talk, Princeton 2015.
- A Primer on PCA (you should know what a matrix is and be comfortable with everything in CMPS/MATH 2170).
A large debt for this course is owed to John P. Dickerson at UMD and his course (http://jpdickerson.com/).
- John P. Dickerson’s DS Class at UMD
- Dennis Sun’s DS Course at CalPoly
- DSC80 and DSC10 at UCSD:
- Data8 Resources (Berkeley Data Science Course)
- Other Berkeley Resources:
- Course and textbook from Allen Downey at Olin College.
- Zico Kolter’s course at Carnegie Mellon University
- Setting up a simple GitHub Pages website with Markdown
- Some random nice websites with tutorials and examples for DS.
- Two nice courses at Berkeley that deal with Data Science and Ethics.
- A couple of Data Science and Machine Learning interview questions. The course covers almost the entire set of DS questions (minus the R questions).