Notebooks used during a data science event created by Alura.
Imersão Dados (Data Immersion) was an event about data science and machine learning. In this event, data from ENEM (the Brazilian National High School Exam) was used for the analysis. The event ran over 5 days, and each day was built around a data science project.
-
The first meeting used Python and the well-known pandas library, and several of its features were used to explore educational data. In this analysis we looked for interesting facts about those enrolled in ENEM, formulated hypotheses, and drew box plots of the distributions to better understand some aspects of Brazilian education.
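As a rough illustration of this kind of exploration, the sketch below computes the values a box plot draws (quartiles and the 1.5 × IQR fences) for each group. It is not taken from the event notebooks: the data is synthetic and the column names `income_band` and `math_score` are hypothetical stand-ins for the real ENEM columns.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the ENEM data; both column names are hypothetical.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({
    "income_band": rng.choice(["A", "B", "C"], size=300),
    "math_score": rng.normal(loc=520, scale=90, size=300).clip(0, 1000),
})


def box_stats(scores: pd.Series) -> pd.Series:
    """Return the values a box plot draws: quartiles and 1.5 * IQR fences."""
    q1, median, q3 = scores.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    return pd.Series({
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    })


# One row per group, one column per box plot statistic.
summary = df.groupby("income_band")["math_score"].apply(box_stats).unstack()
print(summary.round(1))
```

With matplotlib installed, `df.boxplot(column="math_score", by="income_band")` draws the same information as a chart.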
-
On this day the data analysis went further and data visualization was discussed, from good practices to the use of a new tool for generating more stylized graphics (seaborn).
-
In this class, heat maps and histograms were covered as ways to understand the data. The heat map was used to observe the correlation between the scores, while histograms and box plots made it possible to understand the distributions and outliers in the scores.
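The numbers behind those charts can be sketched as follows. This is an illustrative example on synthetic scores, not the event's code: the column names are hypothetical, and the 1.5 × IQR rule is used as one common definition of an outlier.

```python
import numpy as np
import pandas as pd

# Synthetic scores: "math" and "languages" share a hidden ability factor,
# so they are correlated; "essay" is generated independently.
rng = np.random.default_rng(seed=1)
ability = rng.normal(size=500)
scores = pd.DataFrame({
    "math": 500 + 80 * ability + rng.normal(scale=40, size=500),
    "languages": 500 + 60 * ability + rng.normal(scale=50, size=500),
    "essay": 550 + 100 * rng.normal(size=500),
})

# Pairwise Pearson correlations: the matrix a heat map colors in.
corr = scores.corr()
print(corr.round(2))

# Histogram of one score: how many values fall into each of 10 bins.
counts, bin_edges = np.histogram(scores["math"], bins=10)
print(counts)

# Outliers by the 1.5 * IQR rule, the same rule a box plot uses for its whiskers.
q1, q3 = scores["math"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (scores["math"] < q1 - 1.5 * iqr) | (scores["math"] > q3 + 1.5 * iqr)
outliers = scores[is_outlier]
print(len(outliers), "outliers in math")
```

With seaborn installed, `seaborn.heatmap(corr, annot=True)` renders the correlation matrix as a heat map.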
-
We built our first machine learning model, learned the difference between regression and classification problems, and saw how to evaluate the model, using the scikit-learn library.
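A minimal sketch of that workflow is shown below. Predicting a numeric score is a regression problem (predicting a category, such as pass or fail, would be classification). The choice of `LinearRegression`, the synthetic data, and the comparison against a mean-prediction baseline are assumptions made for illustration and are not necessarily what the event used.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data: three input scores and a target score that depends on them.
rng = np.random.default_rng(seed=2)
X = rng.normal(loc=500, scale=80, size=(400, 3))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(scale=20, size=400)

# Hold out 25% of the rows so the model is evaluated on data it never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# Compare against a naive baseline that always predicts the training mean.
mse = mean_squared_error(y_test, pred)
baseline_mse = mean_squared_error(y_test, np.full_like(y_test, y_train.mean()))
print(f"model MSE: {mse:.1f}, baseline MSE: {baseline_mse:.1f}")
```

Comparing against a simple baseline gives the error a reference point: a model is only useful if it clearly beats the naive prediction.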
-
We dive deeper into the world of machine learning, discuss techniques such as cross-validation that help increase our confidence in the results of machine learning models, and show the much-feared overfitting happening.
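One way to see both ideas at once is to cross-validate a shallow and an unrestricted decision tree and compare training scores with validation scores. The sketch below assumes synthetic data and a `DecisionTreeRegressor`; neither is claimed to match the event notebooks.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_validate
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data with a fair amount of noise.
rng = np.random.default_rng(seed=3)
X = rng.normal(size=(300, 3))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=300)

# 5-fold cross-validation: each row is used for validation exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for depth in (2, None):  # None lets the tree grow until it memorizes the data
    model = DecisionTreeRegressor(max_depth=depth, random_state=0)
    scores = cross_validate(model, X, y, cv=cv, return_train_score=True)
    results[depth] = (scores["train_score"].mean(), scores["test_score"].mean())
    print(f"max_depth={depth}: train R2={results[depth][0]:.2f}, "
          f"validation R2={results[depth][1]:.2f}")
```

A large gap between the training score and the validation score is the signature of overfitting: the unrestricted tree fits the training folds almost perfectly but generalizes poorly.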
This project is licensed under the MIT License - see the LICENSE file for details.