Université Panthéon-Assas, Paris II
Instructor: Amir Sani (reachme@amirsani.com)
Course Details
This is an applied course in Machine Learning intended for students of Economics and Finance. Course contents will be posted before each class.
Programming Language:
The official language of this course is Python 3. We will use Anaconda in class.
You are expected to know how to program in Python before taking this class.
Grading:
Grades will be based on individual quizzes and a team project.
Materials
The following materials are useful for the course:
- The Elements of Statistical Learning: Data Mining, Inference and Prediction, Python Notebooks
- Applied Predictive Modeling
- Probability Theory Review
- Linear Algebra Review
Course Schedule
Class 1: Foundation
- Slides
- Information Sheet
- Introduction to the Kaggle House Price Prediction Challenge
- Team Selection
- Review the algorithms in Scikit-Learn
Class Notebooks:
Quizzes:
- Class 1 Quiz
- Lab 1 Quiz (I've merged all the lab quizzes into one.)
Read:
- An overview of gradient descent optimization algorithms
- Visualizing MNIST
- Ch. 19, "Deviations," from Lehman E, Leighton T, Meyer AR. Mathematics for Computer Science. Technical report, 2006. Lecture notes; 2010.
For a more theoretical treatment, see:
- Statistical Learning Theory Lecture Notes, Percy Liang (I highly recommend these notes if you are interested in learning the theory behind machine learning.)
- Statistical Learning Theory Lecture Notes, Rui Castro
Class 2: Feature Selection
Class Notebook:
Read:
- Principal Component Analysis Explained Visually
- Selecting good features Part I through IV
- Does your Data have a Signal?
Class 3: Algorithms and Analysis
By the end of this class, you should have set up a basic submission for the Kaggle challenge.
- Class 3 Quiz
- First Kaggle team submission due.
- Review the algorithms in Scikit-Learn
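A basic first submission can be sketched as follows. This is only a minimal illustration, not the course's reference solution: it assumes the competition data uses an `Id` column and a `SalePrice` target, and the helper name `make_submission`, the Ridge model, and the median imputation are arbitrary choices made for the example. It deliberately ignores categorical columns so that the whole pipeline, from data to a file in submission layout, fits in a few lines.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline


def make_submission(train: pd.DataFrame, test: pd.DataFrame) -> pd.DataFrame:
    """Fit a numeric-only baseline and return predictions in submission layout.

    Hypothetical helper for illustration; column names are assumptions.
    """
    target = "SalePrice"  # assumed name of the target column
    id_col = "Id"         # assumed name of the row identifier column

    # Keep only numeric predictors that are present in both tables.
    numeric = train.select_dtypes(include="number").columns
    features = [
        c for c in numeric
        if c not in (target, id_col) and c in test.columns
    ]

    # Fill missing values with the column median, then fit a ridge regression.
    model = make_pipeline(SimpleImputer(strategy="median"), Ridge(alpha=1.0))

    # Fit on log prices (a common choice for skewed price targets).
    model.fit(train[features], np.log1p(train[target]))

    # Undo the log transform so predictions are back on the price scale.
    preds = np.expm1(model.predict(test[features]))
    return pd.DataFrame({id_col: test[id_col].to_numpy(), target: preds})


# Example usage, assuming the competition files were saved locally
# as train.csv and test.csv (file names are assumptions):
#
#   train = pd.read_csv("train.csv")
#   test = pd.read_csv("test.csv")
#   make_submission(train, test).to_csv("submission.csv", index=False)
```

Once a baseline like this uploads successfully, it gives the team a score to beat when adding categorical features or trying other Scikit-Learn estimators.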
Class Notebook:
Read:
Class 4: Model Selection and Evaluation
Class Notebook:
Read:
- Kaggle Ensembling Guide
- The Dangers of Overfitting or How to Drop 50 spots in 1 minute
- A survey of cross-validation procedures for model selection
- Residual diagnostics
- Residual analysis to determine the optimal cutoff frequency
- Hacking the Random Walk Hypothesis
Class 5: Final Submissions
- Quiz 5
- Final Kaggle submission due at 10h00, both on Kaggle and as a Python Notebook (this is your team project)
- Kaggle competition (class) leaderboard review at 11h00
Please submit your final project notebook via email (to reachme@amirsani.com) and complete this Final Project submission form.