Analyzing the census data and predicting whether the income exceeds $50K per year. Followed end to end modelling process involving the following steps:
- Performing Exploratory Data Analysis and establishing hypothesis of the data.
- Testing for Multi col-linearity and removing the outliers.
- Creating training and validation datasets using Stratified Random Sampling (SRS) of data.
- Fitting Classification model on training set using Logistic Regression & Decision Tree.
- Performing the validation of the models on Test data by ROC curve, Confusion Matrix
- Freezing the final model by the performance of the two models.