Below are assignment due dates and a record of the topics covered each week this semester, updated after each week. See the syllabus for planned topics.
| Assignment | Due |
|---|---|
| 1 (LOOCV) | TBA |
| 2 (classification, KNN) | TBA |
| 3 (decision trees, bagging) | TBA |
| 4 (boosting and neural networks) | TBA |
| Individual project presentations (finals week, TBA) | TBA |
| Individual project code or paper | TBA |
| Lead | Week | Day | Date | Paper |
|---|---|---|---|---|
| TBA | TBA | TBA | TBA | TBA |
- Intro to class
- What is machine learning?
- Algorithms in data science: model, training, inference
- Statistical inference: accuracy of a trained model
- Regression and classification
- Machine learning in a nutshell, all of it!
- Polynomial model algorithm
- Optimization training algorithms
  - minimizing the training error
- Cross-validation (CV) inference algorithm
  - k-fold CV for regression
  - train-test split
  - mean squared error
- Tuning parameters
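To see how these pieces fit together, here is a minimal NumPy sketch (illustrative only, not course code; function and variable names are made up) that trains polynomial models by minimizing the training error and uses k-fold CV on the mean squared error to choose the tuning parameter, the polynomial degree:

```python
import numpy as np

def kfold_cv_mse(x, y, degree, k=5, seed=0):
    """Estimate test MSE of a degree-`degree` polynomial fit with k-fold CV."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = np.polyfit(x[train], y[train], degree)  # least squares: minimize training error
        pred = np.polyval(coef, x[test])
        errs.append(np.mean((y[test] - pred) ** 2))    # test MSE on the held-out fold
    return float(np.mean(errs))

# toy data: quadratic signal plus noise
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 200)
y = 1 + 2 * x - x ** 2 + rng.normal(0, 0.3, 200)
cv = {d: kfold_cv_mse(x, y, d) for d in range(1, 6)}
best = min(cv, key=cv.get)   # degree with smallest estimated test MSE
```

Note that the tuning parameter is chosen to minimize the CV estimate of test error, not the training error, which always decreases as the degree grows.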
- Model algorithms
  - Smoothing splines
  - k nearest neighbors (KNN), regression and classification
- Training algorithm
  - regularization: penalized least squares
- CV inference algorithm
  - k-fold CV for classification
  - error rate
- Theory: bias-variance tradeoff
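A small NumPy sketch of KNN classification with k-fold CV on the error rate (illustrative only, not course code):

```python
import numpy as np

def knn_predict(X_train, y_train, X_new, k):
    """Majority-vote KNN classification for each row of X_new."""
    preds = []
    for x in X_new:
        d = np.sum((X_train - x) ** 2, axis=1)   # squared distances to training points
        nn = y_train[np.argsort(d)[:k]]          # labels of the k nearest neighbors
        preds.append(np.bincount(nn).argmax())   # majority vote
    return np.array(preds)

def cv_error_rate(X, y, k_neighbors, n_folds=5, seed=0):
    """k-fold CV estimate of the classification error rate."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    errs = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        pred = knn_predict(X[train], y[train], X[test], k_neighbors)
        errs.append(np.mean(pred != y[test]))    # misclassification rate on the fold
    return float(np.mean(errs))

# two well-separated Gaussian classes
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
err = cv_error_rate(X, y, k_neighbors=5)
```

Here k plays the role of the tuning parameter: small k gives low bias but high variance, large k the reverse, which is the bias-variance tradeoff in miniature.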
- Model algorithms
  - Decision tree models
  - Ensemble algorithms: bagging
- Training algorithms
  - Recursive binary partitioning for decision trees
- Inference algorithms
  - Tuning decision trees with CV
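As a sketch of recursive binary partitioning, here is a minimal one-feature regression tree in NumPy (illustrative only, not course code): at each node it scans candidate split points, keeps the one that most reduces the residual sum of squares, and recurses until a depth limit or minimum leaf size is hit.

```python
import numpy as np

def grow_tree(x, y, depth, min_leaf=5):
    """Recursive binary partitioning on one feature; leaves predict the mean."""
    if depth == 0 or len(y) < 2 * min_leaf:
        return float(y.mean())                        # leaf node
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = None
    for i in range(min_leaf, len(ys) - min_leaf):     # candidate split points
        rss = ((ys[:i] - ys[:i].mean()) ** 2).sum() + ((ys[i:] - ys[i:].mean()) ** 2).sum()
        if best is None or rss < best[0]:
            best = (rss, (xs[i - 1] + xs[i]) / 2, i)
    _, cut, i = best
    return (cut,
            grow_tree(xs[:i], ys[:i], depth - 1, min_leaf),
            grow_tree(xs[i:], ys[i:], depth - 1, min_leaf))

def predict(tree, x):
    """Follow splits down to a leaf."""
    while isinstance(tree, tuple):
        tree = tree[1] if x < tree[0] else tree[2]
    return tree

# step-function data: the first split should land near x = 0.5
rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 200)
y = np.where(x < 0.5, 1.0, 3.0) + rng.normal(0, 0.1, 200)
tree = grow_tree(x, y, depth=2)
```

The depth limit here is the tuning parameter one would choose with CV, exactly as for polynomial degree or k in KNN.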
- Model algorithms
  - Ensemble algorithms: bagging, random forests
- Inference algorithms
  - Tuning bagging with CV
  - Tuning random forests with "out of bag" CV
- Explainable machine learning: variable importance
- Parallel processing
  - Using a compute server
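The idea behind "out of bag" CV can be seen with a few lines of NumPy (illustrative only, not course code): each tree in a bagged ensemble is trained on a bootstrap sample drawn with replacement, and the rows that were never drawn serve as a built-in test set for that tree.

```python
import numpy as np

rng = np.random.default_rng(4)
n, n_trees = 1000, 200
oob_fractions = []
for _ in range(n_trees):
    boot = rng.integers(0, n, n)            # bootstrap: n row indices, with replacement
    in_bag = np.zeros(n, dtype=bool)
    in_bag[boot] = True
    oob_fractions.append(1 - in_bag.mean()) # fraction of rows never drawn
avg_oob = float(np.mean(oob_fractions))     # ≈ (1 - 1/n)^n ≈ e^(-1) ≈ 0.368
```

Since roughly a third of the data is out of bag for every tree, OOB error estimates come essentially for free, with no separate CV loop.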
- Model algorithms
  - Ensemble algorithm: boosted trees
- Training algorithms
  - Gradient descent
  - Stochastic gradient descent
  - Gradient boosting
  - Stochastic gradient boosting
  - Extreme gradient boosting, XGBoost
- Inference algorithms
  - Tuning strategies for multiple hyperparameters
- One hot encoding for categorical variables
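For squared-error loss the negative gradient is just the residual, so gradient boosting reduces to repeatedly fitting small trees to the current residuals. A minimal NumPy sketch with single-split stumps (illustrative only, not course code; XGBoost adds many refinements on top of this core loop):

```python
import numpy as np

def fit_stump(x, y):
    """Best single-split regression stump (one step of recursive partitioning)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = None
    for i in range(1, len(ys)):
        rss = ((ys[:i] - ys[:i].mean()) ** 2).sum() + ((ys[i:] - ys[i:].mean()) ** 2).sum()
        if best is None or rss < best[0]:
            best = (rss, (xs[i - 1] + xs[i]) / 2, ys[:i].mean(), ys[i:].mean())
    _, cut, left, right = best
    return lambda t: np.where(t < cut, left, right)

rng = np.random.default_rng(5)
x = rng.uniform(0, 1, 300)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 300)

lr, pred = 0.1, np.zeros_like(y)          # lr is the shrinkage tuning parameter
for _ in range(200):                      # number of trees: another tuning parameter
    stump = fit_stump(x, y - pred)        # fit the residuals (negative gradient)
    pred = pred + lr * stump(x)           # take a small step
mse = float(np.mean((y - pred) ** 2))
```

The learning rate and the number of trees interact, which is why boosting typically needs a tuning strategy over multiple hyperparameters at once rather than one CV sweep per parameter.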
- Model algorithms
  - Single layer neural networks
  - Architectures for regression, classification, multifunction
  - Deep learning: multilayer neural networks
  - Wide vs deep, expressiveness
- Training algorithm
  - Mini-batch stochastic gradient descent
- Using Keras library
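In practice we let Keras handle the details, but the mechanics of a single-layer network trained by mini-batch stochastic gradient descent fit in a short NumPy sketch (illustrative only, not course code):

```python
import numpy as np

# One hidden layer, ReLU activation, squared loss, mini-batch SGD.
rng = np.random.default_rng(6)
x = rng.uniform(-1, 1, (500, 1))
y = x ** 2                                     # target function to learn

h = 16                                         # hidden units (width)
W1 = rng.normal(0, 0.5, (1, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.5, (h, 1)); b2 = np.zeros(1)
lr, batch = 0.05, 32

for epoch in range(500):
    idx = rng.permutation(len(x))              # reshuffle each epoch
    for start in range(0, len(x), batch):
        i = idx[start:start + batch]           # one mini-batch
        a = x[i] @ W1 + b1                     # hidden pre-activations
        z = np.maximum(a, 0)                   # ReLU
        out = z @ W2 + b2
        g_out = 2 * (out - y[i]) / len(i)      # gradient of mean squared loss
        g_W2 = z.T @ g_out; g_b2 = g_out.sum(0)
        g_z = g_out @ W2.T
        g_a = g_z * (a > 0)                    # backprop through ReLU
        g_W1 = x[i].T @ g_a; g_b1 = g_a.sum(0)
        W1 -= lr * g_W1; b1 -= lr * g_b1       # SGD step on each parameter
        W2 -= lr * g_W2; b2 -= lr * g_b2

pred = np.maximum(x @ W1 + b1, 0) @ W2 + b2
mse = float(np.mean((pred - y) ** 2))
```

Each mini-batch gives a noisy but cheap gradient estimate; cycling through shuffled mini-batches is what makes this "stochastic" rather than full-batch gradient descent.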
- Model algorithms
  - Deep learning: convolutional neural networks
- Training algorithms
  - Dropout regularization
  - Train-validate-test split
- Generalization & importance of data
  - Scope of inference
  - Test set leakage
- Transfer learning, pre-trained models
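The train-validate-test split can be sketched in a few lines of NumPy (illustrative only, not course code): hyperparameters are tuned on the validation set, and the test set is touched exactly once at the end, so that no information about it leaks into model selection.

```python
import numpy as np

def train_val_test_split(n, frac=(0.6, 0.2, 0.2), seed=0):
    """Random, disjoint train/validation/test index sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)                   # shuffle row indices once
    n_train = int(frac[0] * n)
    n_val = int(frac[1] * n)
    return (idx[:n_train],                     # fit models here
            idx[n_train:n_train + n_val],      # tune hyperparameters here
            idx[n_train + n_val:])             # report final performance here, once

train, val, test = train_val_test_split(1000)
```

Reusing the test set to pick hyperparameters is a form of test set leakage: the reported error then underestimates the true generalization error.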
- Reading and discussion: contemporary and emerging applications in ecology
- Individual project