This project applies machine learning techniques to predict survival outcomes on the Titanic dataset. It combines exploratory data analysis, feature engineering, and model selection through cross-validation, followed by a final evaluation on a held-out test set.
The project implements a supervised learning workflow on the Titanic dataset, which is divided into training and test sets. The goal is to classify each passenger as survived or not survived.
- Exploratory Data Analysis (EDA):
  - Visualizing the data to understand patterns and relationships.
  - Performing feature engineering to create meaningful features for modeling.
- Model Training:
  - Training several classification algorithms, such as k-nearest neighbors (KNN) and Naive Bayes.
  - Using 5-fold cross-validation with grid search to select hyperparameters.
- Evaluation:
  - Assessing the selected models on the held-out test set.
  - Reporting metrics such as accuracy, precision, recall, and F1-score.
- Programming Languages: Python
- Libraries: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn
- Techniques: Supervised learning, feature engineering, cross-validation, grid search
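
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes a synthetic stand-in for the Titanic data (`make_toy_passengers` is a hypothetical helper with a made-up survival rule), an illustrative `add_features` step, and small example hyperparameter grids. The column names follow the usual Titanic naming, but the features and grids the project really uses may differ.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def make_toy_passengers(n=400, seed=0):
    """Hypothetical synthetic stand-in for the Titanic data (NOT the real dataset)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "Pclass": rng.integers(1, 4, n),
        "Sex": rng.choice(["male", "female"], n),
        "Age": rng.uniform(1, 70, n).round(),
        "SibSp": rng.integers(0, 4, n),
        "Parch": rng.integers(0, 3, n),
    })
    # Made-up survival rule plus noise, only so the models have a signal to learn.
    score = 1.5 * (df["Sex"] == "female") - 0.5 * df["Pclass"] + rng.normal(0, 1, n)
    df["Survived"] = (score > score.median()).astype(int)
    return df


def add_features(df):
    """Illustrative feature engineering: encode sex and derive family size."""
    out = df.copy()
    out["IsFemale"] = (out["Sex"] == "female").astype(int)
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    return out[["Pclass", "Age", "IsFemale", "FamilySize"]]


df = make_toy_passengers()
X, y = add_features(df), df["Survived"]

# Hold out a test set that is never seen during hyperparameter selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Example grids only; the values are assumptions chosen for illustration.
candidates = {
    "knn": (KNeighborsClassifier(), {"clf__n_neighbors": [3, 5, 7, 9]}),
    "naive_bayes": (GaussianNB(), {"clf__var_smoothing": [1e-9, 1e-7, 1e-5]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scaling lives inside the pipeline so each CV fold is scaled independently.
    pipe = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)

    # GridSearchCV refits the best configuration on the full training set.
    y_pred = search.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary"
    )
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

for name, metrics in results.items():
    print(name, metrics)
```

Keeping the scaler inside the `Pipeline` is a deliberate choice in this sketch: KNN is sensitive to feature scale, and fitting the scaler within each fold avoids leaking validation data into the preprocessing step.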