GitHub - usr-av/mlcourse.ai: Open Machine Learning Course

mlcourse.ai – Open Machine Learning Course

Current session: February 11th - April 26th, 2019. You can join at any point: fill in this form to participate, and please explore the main page mlcourse.ai as well.

Mirrors (🇬🇧-only): mlcourse.ai (main site), Kaggle Dataset (same notebooks as Kernels)

Outline

This is the list of published articles on medium.com 🇬🇧, habr.com 🇷🇺, and jqr.com 🇨🇳. Icons are clickable. Also, links to Kaggle Kernels (in English) are given. This way one can reproduce everything without installing a single package.

Exploratory Data Analysis with Pandas 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
Visual Data Analysis with Python 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2
Classification, Decision Trees and k Nearest Neighbors 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
Linear Classification and Regression 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2, part3, part4, part5
Bagging and Random Forest 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2, part3
Feature Engineering and Feature Selection 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
Unsupervised Learning: Principal Component Analysis and Clustering 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
Vowpal Wabbit: Learning with Gigabytes of Data 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
Time Series Analysis with Python, part 1 🇬🇧 🇷🇺 🇨🇳. Predicting the future with Facebook Prophet, part 2 🇬🇧 🇨🇳, Kaggle Kernels: part1, part2
Gradient Boosting 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel

Lectures

Video lectures are uploaded to this YouTube playlist. Introduction: video, slides

Exploratory data analysis with Pandas, video
Visualization, main plots for EDA, video
Decision trees: theory and practical part
Logistic regression: theoretical foundations, practical part (baselines in the "Alice" competition)
Ensembles and Random Forest – part 1. Classification metrics – part 2. Example of a business task, predicting a customer payment – part 3
Linear regression and regularization - theory, LASSO & Ridge, LTV prediction - practice
Unsupervised learning - Principal Component Analysis and Clustering
Stochastic Gradient Descent for classification and regression - part 1, part 2 TBA
Time series analysis with Python (ARIMA, Prophet) - video
Gradient boosting: basic ideas - part 1, key ideas behind Xgboost, LightGBM, and CatBoost + practice - part 2

Spring 2019 assignments

Exploratory Data Analysis (EDA) of US flights, nbviewer. Deadline: February 24, 20:59 GMT
In Assignment 2, you'll be beating baselines in the first two competitions:
- Part 1. User Identification with Logistic Regression (beating baselines in the "Alice" competition), nbviewer. Deadline: March 10, 20:59 GMT
- Part 2. Predicting Medium articles popularity with Ridge Regression (beating baselines in the "Medium" competition), nbviewer. Deadline: March 10, 20:59 GMT

Demo assignments, just for practice

The following are demo versions. Full versions are announced during course sessions.

Exploratory data analysis with Pandas, nbviewer, Kaggle Kernel, solution
Analyzing cardiovascular disease data, nbviewer, Kaggle Kernel, solution
Decision trees with a toy task and the UCI Adult dataset, nbviewer, Kaggle Kernel, solution
Sarcasm detection, Kaggle Kernel, solution. Linear Regression as an optimization problem, nbviewer, Kaggle Kernel
Logistic Regression and Random Forest in the credit scoring problem, nbviewer, Kaggle Kernel
Exploring OLS, Lasso and Random Forest in a regression task, nbviewer, Kaggle Kernel, solution
Unsupervised learning, nbviewer, Kaggle Kernel
Implementing online regressor, nbviewer, Kaggle Kernel, solution
Time series analysis, nbviewer, Kaggle Kernel, solution
Beating a baseline in a competition, Kaggle Kernel

Kaggle competitions

Catch Me If You Can: Intruder Detection through Webpage Session Tracking. Kaggle Inclass
How good is your Medium article? Kaggle Inclass

Rating

Throughout the course we maintain a student rating. It takes into account credits scored in assignments and Kaggle competitions. They say the rating strongly motivates students to finish the course. Top students (according to the final rating) are listed on a special page.

Community

Discussions between students are held in the #mlcourse_ai channel of the OpenDataScience Slack team. Fill in this form to get an invitation (you can join at any point before the course ends on April 26, 2019). The form will also ask you some personal questions; don't hesitate 👋

The course is free, but you can support the organizers by making a pledge on Patreon (monthly support) or a one-time payment on Ko-fi. This way you'll help spread Machine Learning around the world!

Folders and files (2,713 commits)

data
docker_files
img
jupyter_english
jupyter_russian
.gitattributes
.gitignore
CONTRIBUTING.md
LICENSE.md
README.md
run_docker_jupyter.sh
run_docker_jupyter_windows.cmd
