#### Université Panthéon-Assas, Paris II

#### Instructor: **Amir Sani** (reachme@amirsani.com)

## Course Details

This is an applied course in Machine Learning intended for students of Economics and Finance. Course contents will be posted before each class.

*Programming Language:*

The official language of this course is Python 3. We will use Anaconda in class.

You are expected to know how to program in Python before taking this class.

*Grading:*

Grades will be based on individual quizzes and a team project.

### Materials

The following materials are useful for the course:

- The Elements of Statistical Learning: Data Mining, Inference and Prediction, **Python Notebooks**

- Applied Predictive Modeling

- Probability Theory Review

- Linear Algebra Review

## Course Schedule

**Class 1**: Foundation

- Slides
- Information Sheet
- Introduction to the Kaggle House Price Prediction Challenge
- Team Selection
- Review the algorithms in Scikit-Learn

#### Class Notebooks:

#### Quizzes:

- Class 1 Quiz
- Lab 1 Quiz (I've merged all the lab quizzes into one.)

#### Read:

- An overview of gradient descent optimization algorithms
- Visualizing MNIST
- Ch. 19, "Deviations", in Lehman E, Leighton T, Meyer AR. Mathematics for Computer Science. Technical report, 2006; lecture notes, 2010.

For a more theoretical treatment, see:

- Statistical Learning Theory Lecture Notes, Percy Liang (I highly recommend these notes if you are interested in learning the theory behind machine learning.)
- Statistical Learning Theory Lecture Notes, Rui Castro

**Class 2**: Feature Selection

#### Class Notebook:

#### Read:

- Principal Component Analysis Explained Visually
- Selecting good features Part I through IV
- Does your Data have a Signal?

**Class 3**: Algorithms and Analysis

By the end of this class, you should have set up a basic submission for the Kaggle challenge.

- Class 3 Quiz
- First Kaggle team submission due.
- Review the algorithms in Scikit-Learn
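
As a starting point for a first submission, here is a minimal sketch of a constant baseline that predicts the training median for every test row. It is not the course's reference solution. The column names `Id` and `SalePrice`, the function name `median_baseline_submission`, and the tiny in-memory data are assumptions made for illustration; check them against the files and sample submission provided by the competition.

```python
import csv
import io
import statistics


def median_baseline_submission(train_file, test_file, out_file,
                               id_col="Id", target_col="SalePrice"):
    """Write a submission that predicts the training median for every test row.

    The default column names are assumptions; adjust them to the competition data.
    """
    # Collect the target values from the training data.
    targets = [float(row[target_col]) for row in csv.DictReader(train_file)]
    prediction = statistics.median(targets)

    # One line per test row: its identifier and the constant prediction.
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow([id_col, target_col])
    n_rows = 0
    for row in csv.DictReader(test_file):
        writer.writerow([row[id_col], prediction])
        n_rows += 1
    return prediction, n_rows


# Tiny made-up data standing in for the real training and test files.
train_csv = io.StringIO(
    "Id,LotArea,SalePrice\n1,8450,200000\n2,9600,180000\n3,11250,220000\n"
)
test_csv = io.StringIO("Id,LotArea\n4,9550\n5,14260\n")
submission = io.StringIO()

prediction, n_rows = median_baseline_submission(train_csv, test_csv, submission)
print(submission.getvalue())
```

With real data, the same function can be called on files opened with `open(..., newline="")`. Once this pipeline produces a valid file, the constant prediction can be swapped for the output of a Scikit-Learn model.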

#### Class Notebook:

#### Read:

**Class 4**: Model Selection and Evaluation

#### Class Notebook:

#### Read:

- Kaggle Ensembling Guide
- The Dangers of Overfitting or How to Drop 50 spots in 1 minute
- A survey of cross-validation procedures for model selection
- Residual diagnostics
- Residual analysis to determine the optimal cutoff frequency
- Hacking the Random Walk Hypothesis

**Class 5**: Final Submissions

- Quiz 5
- Final Kaggle submission due at 10h00, both on Kaggle and as a Python notebook (this is your team project)
- Kaggle competition (class) leaderboard review at 11h00

Please submit your final project notebook via email (to reachme@amirsani.com) and complete this Final Project submission form.