This course covers a wide variety of topics in machine learning and statistical modeling. While mathematical methods and theoretical aspects will be covered, the primary goal is to provide students with the tools and principles needed to solve the data science problems found in practice. This course also serves as a foundation on which more specialized courses and further independent study can build.
This course was designed as part of the core curriculum for the Center for Data Science's Master's degree in Data Science. Other interested students who satisfy the prerequisites are welcome to take the class as well. Note that this class is intended as a continuation of DS-GA-1001 Intro to Data Science, which covers some important, fundamental data science topics that may not be explicitly covered in this DS-GA class (e.g. data cleaning, cross-validation, and sampling bias).
This term we will be using Piazza for class discussion. The system is designed to get you help quickly and efficiently from classmates, the TA, graders, and the instructor. Rather than emailing questions to the teaching staff, you are encouraged to post your questions on Piazza. If you have any problems or feedback for the developers, email team@piazza.com.
Other information:
Grading
Homework (40%) + One-Hour Test (15%) + Two-Hour Test (25%) + Project (20%)
Many homework assignments will have problems designated as “optional”. At the end of the semester, strong performance on these problems may lift the final course grade by up to half a letter grade (e.g. B+ to A- or A- to A), especially for borderline grades. You should view the optional problems primarily as a way to engage with more material, if you have the time. Along with the performance on optional problems, we will also consider significant contributions to Piazza and in-class discussions for boosting a borderline grade.
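As a rough illustration of how the grading weights above combine, here is a minimal Python sketch. The component names, the `weighted_score` helper, and the example scores are all hypothetical. It assumes each component is scored as a percentage from 0 to 100. It does not model the discretionary adjustment for optional problems or participation, and it does not map percentages to letter grades, since the syllabus does not specify either.

```python
# Weights taken from the grading breakdown above (they sum to 100%).
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "homework": 0.40,
    "one_hour_test": 0.15,
    "two_hour_test": 0.25,
    "project": 0.20,
}


def weighted_score(scores):
    """Combine per-component percentages (0-100) into one overall percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores, for illustration only.
example = {
    "homework": 90,
    "one_hour_test": 80,
    "two_hour_test": 85,
    "project": 95,
}
overall = weighted_score(example)
print(round(overall, 2))
```

Because homework carries the largest weight, a change in the homework average moves the overall score more than the same change in any other single component.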