This toolkit provides example machine learning projects for fiscal management. The examples cover supervised learning methods for classification and regression, as well as unsupervised methods for anomaly detection. The code is written in R and Python and can be adapted to other programming languages. Each algorithm is illustrated with a synthetic (artificially generated) dataset.
This repository includes presentations, videos, code, and theoretical documentation, all prepared by Rodrigo Azuero, Cesar Montiel, and Ana Yarygina.
- Introduction to machine learning
  - What is machine learning?
  - When is it useful?
- Introduction to supervised learning
  - Regression and classification
  - Bayesian classification
  - Maximum likelihood estimation
  - Gradient descent
- Tree-based methods
  - Decision trees
  - Regression single-tree models
  - Random forest
  - Boosting, bootstrap, bagging
- Model selection and regularization
  - Criteria for model and subset selection
  - Regularization: LASSO, Ridge, and others
  - Overfitting
  - K-fold cross-validation
- Neural networks
  - Neural network topology
  - Activation functions
  - Cross-entropy cost minimization
  - Parallelization
- Unsupervised learning
  - K-means clustering
  - Dimensionality reduction
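
As a rough illustration of the kind of exercise the toolkit covers, the sketch below generates a small fake dataset and flags unusual records without using any labels. It is not taken from the repository: the function names (`make_fake_declarations`, `flag_anomalies`), the simulated "declared amounts", and the choice of a median-based (modified z-score) rule are all assumptions made for this example, and the toolkit's own code may use different methods and data.

```python
import random
import statistics


def make_fake_declarations(n=200, n_anomalies=5, seed=0):
    """Generate hypothetical declared amounts and inflate a few of them."""
    rng = random.Random(seed)
    # Typical records: amounts scattered around 1000 (illustrative units).
    amounts = [rng.gauss(1000.0, 100.0) for _ in range(n)]
    # Pick a few records at random and make them obviously unusual.
    anomalous_idx = rng.sample(range(n), n_anomalies)
    for i in anomalous_idx:
        amounts[i] *= 5
    return amounts, sorted(anomalous_idx)


def flag_anomalies(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so it is less distorted by the outliers themselves
    than a mean/standard-deviation rule. The constant 0.6745 and the
    threshold 3.5 follow a common convention for this score.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # All values (or most of them) are identical: nothing to flag.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


amounts, injected = make_fake_declarations()
flagged = flag_anomalies(amounts)
print("injected anomalies:", injected)
print("flagged records:   ", flagged)
```

Because the rule is unsupervised, it may also flag a few ordinary records that happen to fall far from the median; the injected records should be among those flagged.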