From 2f8e84840ebcb55bf6abba0dfe57c302cb65ad89 Mon Sep 17 00:00:00 2001 From: VARUNSHIYAM <138989960+Varunshiyam@users.noreply.github.com> Date: Tue, 5 Nov 2024 19:42:47 +0530 Subject: [PATCH 1/2] Fixes #776 --- ...a-ml-model-fight-predictions-ufc-259.ipynb | 1754 +++++++++++++++++ 1 file changed, 1754 insertions(+) create mode 100644 Prediction Models/MMA_Fight_prediction/mma-ml-model-fight-predictions-ufc-259.ipynb diff --git a/Prediction Models/MMA_Fight_prediction/mma-ml-model-fight-predictions-ufc-259.ipynb b/Prediction Models/MMA_Fight_prediction/mma-ml-model-fight-predictions-ufc-259.ipynb new file mode 100644 index 00000000..dc6ce51a --- /dev/null +++ b/Prediction Models/MMA_Fight_prediction/mma-ml-model-fight-predictions-ufc-259.ipynb @@ -0,0 +1,1754 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Mixed Martial Arts and the UFC\n", + "\n", + "\n", + "The UFC is the largest MMA promotion company in the world and features some of the highest-level fighters in the sport. As of 2020 the UFC has held over 500 events featuring fighters in 12 different weight divisions. The data set is a collection of over 5000 fights from the years 1993 to 2019.\n", + "\n", + "Being a huge fan of MMA, I wanted to design some Machine Learning Models to experiment with the available data. The goal is to make a model to predict fight outcomes, and see if it has any usefulness in real-world application.\n", + "\n", + "In this particular notebook I reduce the data down to (what I felt was) core stats, so despite this dataset having over 145 features, I reduce it down to height, weight, reach, win streak, lose streak, total wins, total losses, and total draws.
In the future I will apply more features to see if the model accuracy improves at all.\n", + "\n", + "In this notebook I use the following algorithms for model building:\n", + "* Gaussian Naive Bayes\n", + "* Logistic Regression\n", + "* Decision Tree\n", + "* KNN\n", + "* Random Forest\n", + "* Support Vector Classifier\n", + "* XGBoost\n", + "* Artificial Neural Network\n", + "\n", + "The models with the highest accuracy score (using k-fold cross-validation) on the training data are then assessed on the testing data.\n", + "\n", + "Finally, the models that performed well are then applied to the upcoming event (March 6th 2021) to make predictions on fight winners." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19", + "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "\n", + "\n", + "from sklearn.model_selection import KFold\n", + "from sklearn import tree\n", + "from sklearn.metrics import accuracy_score\n", + "from sklearn.metrics import confusion_matrix\n", + "\n", + "from sklearn.model_selection import StratifiedKFold\n", + "from sklearn.model_selection import cross_val_score\n", + "from sklearn.naive_bayes import GaussianNB\n", + "from sklearn.linear_model import LogisticRegression\n", + "from sklearn.neighbors import KNeighborsClassifier\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "from sklearn.svm import SVC\n", + "import xgboost\n", + "from xgboost import XGBClassifier\n", + "\n", + "import keras \n", + "from keras.models import Sequential\n", + "from keras.layers import Dense\n", + "from keras import layers, models, optimizers\n", + "from sklearn.preprocessing import LabelEncoder\n", + "\n", + "from sklearn.metrics
import confusion_matrix\n", + "from sklearn.metrics import plot_confusion_matrix\n", + "from sklearn.metrics import accuracy_score\n", + "\n", + "import os\n", + "for dirname, _, filenames in os.walk('/kaggle/input'):\n", + " for filename in filenames:\n", + " print(os.path.join(dirname, filename))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Import and clean data for use in models" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "data_df = pd.read_csv('../input/ufcdata/data.csv')" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 5144 entries, 0 to 5143\n", + "Columns: 145 entries, R_fighter to R_age\n", + "dtypes: bool(1), float64(134), int64(1), object(9)\n", + "memory usage: 5.7+ MB\n" + ] + } + ], + "source": [ + "data_df.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "df = data_df.dropna()" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Int64Index: 3202 entries, 0 to 5008\n", + "Columns: 145 entries, R_fighter to R_age\n", + "dtypes: bool(1), float64(134), int64(1), object(9)\n", + "memory usage: 3.5+ MB\n" + ] + } + ], + "source": [ + "df.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [], + "source": [ + "columns=df.select_dtypes(include='object').columns" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + 
"text/plain": [ + "Index(['R_fighter', 'B_fighter', 'Referee', 'date', 'location', 'Winner',\n", + " 'weight_class', 'B_Stance', 'R_Stance'],\n", + " dtype='object')" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "columns" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py:4174: SettingWithCopyWarning: \n", + "A value is trying to be set on a copy of a slice from a DataFrame\n", + "\n", + "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", + " errors=errors,\n" + ] + } + ], + "source": [ + "df.drop(columns=['R_fighter', 'B_fighter', 'Referee', 'date', 'location','weight_class'], inplace=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
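<div>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n", + " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th>Winner</th>\n", + " <th>B_Stance</th>\n", + " <th>R_Stance</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <th>0</th>\n", + " <td>Red</td>\n", + " <td>Orthodox</td>\n", + " <td>Orthodox</td>\n", + " </tr>\n", + " <tr>\n", + " <th>1</th>\n", + " <td>Red</td>\n", + " <td>Orthodox</td>\n", + " <td>Southpaw</td>\n", + " </tr>\n", + " <tr>\n", + " <th>2</th>\n", + " <td>Red</td>\n", + " <td>Orthodox</td>\n", + " <td>Orthodox</td>\n", + " </tr>\n", + " <tr>\n", + " <th>3</th>\n", + " <td>Blue</td>\n", + " <td>Switch</td>\n", + " <td>Orthodox</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4</th>\n", + " <td>Blue</td>\n", + " <td>Southpaw</td>\n", + " <td>Southpaw</td>\n", + " </tr>\n", + " <tr>\n", + " <th>...</th>\n", + " <td>...</td>\n", + " <td>...</td>\n", + " <td>...</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4887</th>\n", + " <td>Red</td>\n", + " <td>Orthodox</td>\n", + " <td>Orthodox</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4901</th>\n", + " <td>Red</td>\n", + " <td>Orthodox</td>\n", + " <td>Orthodox</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4923</th>\n", + " <td>Red</td>\n", + " <td>Orthodox</td>\n", + " <td>Orthodox</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4967</th>\n", + " <td>Red</td>\n", + " <td>Orthodox</td>\n", + " <td>Orthodox</td>\n", + " </tr>\n", + " <tr>\n", + " <th>5008</th>\n", + " <td>Red</td>\n", + " <td>Southpaw</td>\n", + " <td>Orthodox</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n", + "<p>3202 rows × 3 columns</p>\n", + "</div>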
" + ], + "text/plain": [ + " Winner B_Stance R_Stance\n", + "0 Red Orthodox Orthodox\n", + "1 Red Orthodox Southpaw\n", + "2 Red Orthodox Orthodox\n", + "3 Blue Switch Orthodox\n", + "4 Blue Southpaw Southpaw\n", + "... ... ... ...\n", + "4887 Red Orthodox Orthodox\n", + "4901 Red Orthodox Orthodox\n", + "4923 Red Orthodox Orthodox\n", + "4967 Red Orthodox Orthodox\n", + "5008 Red Southpaw Orthodox\n", + "\n", + "[3202 rows x 3 columns]" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df.select_dtypes(include='object')" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n", + "A value is trying to be set on a copy of a slice from a DataFrame.\n", + "Try using .loc[row_indexer,col_indexer] = value instead\n", + "\n", + "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", + " \n", + "/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:3: SettingWithCopyWarning: \n", + "A value is trying to be set on a copy of a slice from a DataFrame.\n", + "Try using .loc[row_indexer,col_indexer] = value instead\n", + "\n", + "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", + " This is separate from the ipykernel package so we can avoid doing imports until\n", + "/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:6: SettingWithCopyWarning: \n", + "A value is trying to be set on a copy of a slice from a DataFrame.\n", + "Try using .loc[row_indexer,col_indexer] = value instead\n", + "\n", + "See the caveats in the documentation: 
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", + " \n" + ] + } + ], + "source": [ + "map_stance = {'Orthodox': 0, 'Switch': 1, 'Southpaw': 2, 'Open Stance': 3}\n", + "df['B_Stance'] = df['B_Stance'].replace(map_stance)\n", + "df['R_Stance'] = df['R_Stance'].replace(map_stance)\n", + "\n", + "map_winner = {'Red': 0, 'Blue': 1, 'Draw': 2}\n", + "df['Winner'] = df['Winner'].replace(map_winner)" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Int64Index: 3202 entries, 0 to 5008\n", + "Columns: 138 entries, Winner to R_age\n", + "dtypes: float64(134), int64(4)\n", + "memory usage: 3.4 MB\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py:4174: SettingWithCopyWarning: \n", + "A value is trying to be set on a copy of a slice from a DataFrame\n", + "\n", + "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", + " errors=errors,\n" + ] + } + ], + "source": [ + "df.drop(columns=df.select_dtypes(include='bool').columns, inplace=True)\n", + "df.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([0, 1, 2])" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df['Winner'].unique()" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [], + "source": [ + "df = df[df['Winner'] != 2]" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Int64Index: 3151 entries, 0 to 5008\n", + "Columns: 138 entries, Winner to R_age\n", + "dtypes: 
float64(134), int64(4)\n", + "memory usage: 3.3 MB\n" + ] + } + ], + "source": [ + "df.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": {}, + "outputs": [], + "source": [ + "X = df.drop(columns=['Winner'])\n", + "Y = df['Winner']" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": {}, + "outputs": [], + "source": [ + "from sklearn.model_selection import train_test_split\n", + "X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.25, random_state=42)" + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Training size = 2363\n", + "Testing size = 788\n" + ] + } + ], + "source": [ + "print(\"Training size = \" + str(X_train.shape[0]))\n", + "print(\"Testing size = \" + str(X_test.shape[0]))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Model Training and Evaluation on Data using k-fold cross validation" + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": {}, + "outputs": [], + "source": [ + "seed = 404\n", + "np.random.seed(seed)" + ] + }, + { + "cell_type": "code", + "execution_count": 89, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Gaussian Naive Bayes K-fold Scores:\n", + "[0.60337553 0.59915612 0.6371308 0.51271186 0.59322034 0.55508475\n", + " 0.58898305 0.6059322 0.58050847 0.5720339 ]\n", + "\n", + "Gaussian Naive Bayes Average Score:\n", + "0.584813702352857\n", + "\n" + ] + } + ], + "source": [ + "from sklearn.model_selection import StratifiedKFold\n", + "from sklearn.model_selection import cross_val_score\n", + "from sklearn.naive_bayes import GaussianNB\n", + "\n", + "gnb = GaussianNB()\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(gnb, X_train, y_train.values.ravel(), cv=kfold)\n", + "gnb_score = 
cv_score.mean()\n", + "print('Gaussian Naive Bayes K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('Gaussian Naive Bayes Average Score:')\n", + "print(gnb_score)\n", + "print()" + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/opt/conda/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py:765: ConvergenceWarning: lbfgs failed to converge (status=1):\n", + "STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n", + "\n", + "Increase the number of iterations (max_iter) or scale the data as shown in:\n", + " https://scikit-learn.org/stable/modules/preprocessing.html\n", + "Please also refer to the documentation for alternative solver options:\n", + " https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n", + " extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Logistic Regression K-fold Scores (training):\n", + "[0.6835443 0.67088608 0.62447257 0.65677966 0.63559322 0.69915254\n", + " 0.6779661 0.68220339 0.61864407 0.62288136]\n", + "\n", + "Logistic Regression Average Score:\n", + "0.6572123292569548\n" + ] + } + ], + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "\n", + "lr = LogisticRegression(max_iter = 10000)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(lr, X_train, y_train.values.ravel(), cv=kfold)\n", + "lr_score = cv_score.mean()\n", + "print('Logistic Regression K-fold Scores (training):')\n", + "print(cv_score)\n", + "print()\n", + "print('Logistic Regression Average Score:')\n", + "print(lr_score)" + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Decision Tree K-fold Scores:\n", + "[0.5907173 0.56962025 0.55696203 
0.55508475 0.58050847 0.63559322\n", + " 0.54661017 0.55932203 0.55508475 0.52966102]\n", + "\n", + "Decision Tree Average Score:\n", + "0.5679163984838732\n" + ] + } + ], + "source": [ + "from sklearn import tree\n", + "\n", + "dt = tree.DecisionTreeClassifier(random_state = 1)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(dt, X_train, y_train.values.ravel(), cv=kfold)\n", + "dt_score = cv_score.mean()\n", + "print('Decision Tree K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('Decision Tree Average Score:')\n", + "print(dt_score)" + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "KNN K-fold Scores:\n", + "[0.60337553 0.5443038 0.61603376 0.59322034 0.59322034 0.54661017\n", + " 0.56355932 0.61016949 0.58474576 0.58898305]\n", + "\n", + "KNN Average Score:\n", + "0.5844221554745047\n" + ] + } + ], + "source": [ + "from sklearn.neighbors import KNeighborsClassifier\n", + "\n", + "knn = KNeighborsClassifier()\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(knn, X_train, y_train.values.ravel(), cv=kfold)\n", + "knn_score = cv_score.mean()\n", + "print('KNN K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('KNN Average Score:')\n", + "print(knn_score)" + ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Random Forest K-fold Scores:\n", + "[0.67510549 0.62869198 0.63291139 0.6440678 0.63559322 0.64830508\n", + " 0.65677966 0.68220339 0.62711864 0.6779661 ]\n", + "\n", + "Random Forest Average Score:\n", + "0.6508742759064579\n" + ] + } + ], + "source": [ + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "rf = RandomForestClassifier(random_state = 1)\n", + "kfold =
StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(rf, X_train, y_train.values.ravel(), cv=kfold)\n", + "rf_score = cv_score.mean()\n", + "print('Random Forest K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('Random Forest Average Score:')\n", + "print(rf_score)" + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Support Vector Classification K-fold Scores:\n", + "[0.64978903 0.64978903 0.64978903 0.64830508 0.64830508 0.64830508\n", + " 0.65254237 0.65254237 0.65254237 0.65254237]\n", + "\n", + "Support Vector Classification Average Score:\n", + "0.6504451834370307\n" + ] + } + ], + "source": [ + "from sklearn.svm import SVC\n", + "\n", + "svc = SVC(probability = True)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(svc, X_train, y_train.values.ravel(), cv=kfold)\n", + "svc_score = cv_score.mean()\n", + "print('Support Vector Classification K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('Support Vector Classification Average Score:')\n", + "print(svc_score)" + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[11:43:35] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:38] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. 
Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:41] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:44] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:46] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:48] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:51] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:54] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:56] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. 
Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "[11:43:59] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n", + "XGBoost Classifier K-fold Scores:\n", + "[0.65400844 0.61181435 0.5907173 0.6440678 0.63559322 0.64830508\n", + " 0.65254237 0.66949153 0.62711864 0.65254237]\n", + "\n", + "XGBoost Classifier Average Score:\n", + "0.6386201101337339\n" + ] + } + ], + "source": [ + "import xgboost\n", + "from xgboost import XGBClassifier\n", + "\n", + "xgb = XGBClassifier(objective='binary:logistic',random_state =1, use_label_encoder=False)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(xgb, X_train, y_train.values.ravel(), cv=kfold)\n", + "xgb_score = cv_score.mean()\n", + "print('XGBoost Classifier K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('XGBoost Classifier Average Score:')\n", + "print(xgb_score)" + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": {}, + "outputs": [], + "source": [ + "from keras.utils import np_utils\n", + "from sklearn.preprocessing import LabelEncoder\n", + "\n", + "encoder = LabelEncoder()\n", + "encoder.fit(y_train)\n", + "encoded_Y = encoder.transform(y_train)\n", + "y_Train = np_utils.to_categorical(encoded_Y)\n", + "\n", + "encoder = LabelEncoder()\n", + "encoder.fit(y_test)\n", + "y_Test = encoder.transform(y_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 87, + "metadata": {}, + "outputs": [], + "source": [ + "import keras \n", + "from keras.models import Sequential\n", + "from keras.layers import Dense\n", + "# from keras import layers, models, optimizers\n", + "\n", + "\n", + "def create_model():\n", + " model = Sequential()\n", + " \n", + " model.add(Dense(X_train.shape[1], 
input_dim=X_train.shape[1], activation='relu'))\n", + " model.add(Dense(64, activation='tanh'))\n", + " model.add(Dense(128, activation='tanh'))\n", + " model.add(Dense(128, activation='tanh')) \n", + " model.add(Dense(32, activation='relu'))\n", + " model.add(Dense(2, activation='sigmoid'))\n", + "\n", + " model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 88, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Neural Network K-fold Scores:\n", + "[0.65400844 0.61181435 0.5907173 0.6440678 0.63559322 0.64830508\n", + " 0.65254237 0.66949153 0.62711864 0.65254237]\n", + "\n", + "Neural Network Average Score:\n", + "0.6386201101337339\n" + ] + } + ], + "source": [ + "from keras.wrappers.scikit_learn import KerasClassifier\n", + "from sklearn.model_selection import KFold\n", + "\n", + "seed = 7\n", + "np.random.seed(seed)\n", + "\n", + "model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)\n", + "\n", + "\n", + "kfold = KFold(n_splits=10, shuffle=True)\n", + "results = cross_val_score(model, X_train, y_Train, cv=kfold)\n", + "nn_score = results.mean()\n", + "print('Neural Network K-fold Scores:')\n", + "print(results)\n", + "print()\n", + "print('Neural Network Average Score:')\n", + "print(nn_score)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Best performing models\n", + "\n", + "With the training accuracy in mind, we will grab the top 3 models and evaluate them on the testing set" + ] + }, + { + "cell_type": "code", + "execution_count": 92, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
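<div>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n", + " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th>Model</th>\n", + " <th>Score Average</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <th>0</th>\n", + " <td>Gaussian Naive Bayes</td>\n", + " <td>0.584814</td>\n", + " </tr>\n", + " <tr>\n", + " <th>1</th>\n", + " <td>Logistic Regression</td>\n", + " <td>0.657212</td>\n", + " </tr>\n", + " <tr>\n", + " <th>2</th>\n", + " <td>Random Forest</td>\n", + " <td>0.650874</td>\n", + " </tr>\n", + " <tr>\n", + " <th>3</th>\n", + " <td>Decision Tree</td>\n", + " <td>0.567916</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4</th>\n", + " <td>K-Nearest Neighbor</td>\n", + " <td>0.584422</td>\n", + " </tr>\n", + " <tr>\n", + " <th>5</th>\n", + " <td>Support Vector Classifier</td>\n", + " <td>0.650445</td>\n", + " </tr>\n", + " <tr>\n", + " <th>6</th>\n", + " <td>XGBoost</td>\n", + " <td>0.638620</td>\n", + " </tr>\n", + " <tr>\n", + " <th>7</th>\n", + " <td>Neural Network</td>\n", + " <td>0.638620</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n", + "</div>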
" + ], + "text/plain": [ + " Model Score Average\n", + "0 Gaussian Naive Bayes 0.584814\n", + "1 Logistic Regression 0.657212\n", + "2 Random Forest 0.650874\n", + "3 Decision Tree 0.567916\n", + "4 K-Nearest Neighbor 0.584422\n", + "5 Support Vector Classifier 0.650445\n", + "6 XGBoost 0.638620\n", + "7 Neural Network 0.638620" + ] + }, + "execution_count": 92, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "scores = [['Gaussian Naive Bayes', gnb_score],\n", + " ['Logistic Regression', lr_score],\n", + " ['Random Forest', rf_score],\n", + " ['Decision Tree', dt_score],\n", + " ['K-Nearest Neighbor', knn_score],\n", + " ['Support Vector Classifier', svc_score],\n", + " ['XGBoost', xgb_score],\n", + " ['Neural Network', nn_score]]\n", + "\n", + "df_scores = pd.DataFrame(scores,\n", + " columns = ['Model', 'Score Average']\n", + " )\n", + "df_scores" + ] + }, + { + "cell_type": "code", + "execution_count": 94, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAATgAAAEWCAYAAADy2YssAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/Il7ecAAAACXBIWXMAAAsTAAALEwEAmpwYAAAmA0lEQVR4nO3debzUZd3/8debRVAWERFzAUEFzYXMBZfSkCyX28qs7jAtW0xpt/JOzTvNzOpnZVlpZeVti7uZaZq4pKklKZAbmCuogAu7gIjA+fz+uK6R4XDOnBmYOXNmeD8fj3mc73zXa+bMfOa6ru/1/X4UEZiZNaNu9S6AmVmtOMCZWdNygDOzpuUAZ2ZNywHOzJqWA5yZNa0NMsBJ+quk4+tdjrZIWiJp+3qXo5ko+T9JCyTdvx77OVDS49UsWz1I+oWkb9S7HJ2hUwKcpHGS/iVpqaSX8/RnJakzjt9aRBweEb+t9n4lfVxSSPqfVvNnShpTZtn6RsQzVS7XGEktOXgukTRL0tnVPEY1SDpU0t2SFkuaI+nvkt5bhV2/HXgXsG1EjF7XnUTEPRGxUxXKswZJw/LnZkqr+YMkvS5pRpn7+bikeztaLyLGR8Q561jchlLzACfpq8AFwPeBNwFbAuOBtwEb1fr4dTAfOFVS/3oXpJXZOXj2JX3hPyXpqDqX6Q2SPghcA/wO2Jb0OTkTeE8Vdr8dMCMillZhX7XUR9JuRc8/Akyv5gEkda/m/rq8iKjZA9gUWAp8oIP1/gv4N/AK8DzwzaJlY4CZrdafARySp0cDk/K2LwHn5/m9gT8A84CFwAPAlnnZXcAJeXoH4G95vbnAZcCAVsc6BXgYWARcBfRu53V8HLgXuBE4q2j+TGBMUXnvy2V6AfgZsFHRugHsCOwHvAh0L1r2fuDhPN0NOA14Opf9amBgO+Vq6z28Gvh60fML8nv/CjAZODDPfxPwKrB50bp7AXOAnvn5J4HHgAXABGC7PF/Aj4CX83v3MLBbG+UT8BzwPyU+I92A/wWezfv7HbBpXjYsv2/H5/3MBc7Iyz4FvAasApYAZxf+T632H8COefoIYBqwGJgFnNLW+wi8OX+WFgJTgfcWLbsUuBC4Ke/nX8AO7by2Qvn/F/h+0fxJwBmk4FyYV/ifL85lfH9RWYpf58KicvwcuJn0XTwkz/t2Xn4qMBHokZ9/Jr+WNj/jjfaodYA7DFhZePNKrDcG2D1/iEeRAtVRJb6cM1gd4O4DPpqn+wL75emTSIFmE6A76UvZPy+7i9UBbkdS86UXsAVwN/DjVse6H9gaGEj6Io9v53V8nBTg9sgf+oF5fnGA24sUvHrkD/ZjwMntfNGeBt5VtOwa4LQ8fXL+YG6by/5L4IoS72/xF3ME6Ys7tmjeccDmuVxfJQXX3nnZzcBnitb9EfDTPH0U8BTpC9aD9CX9Z152KClYDiAFsTcDW7VRvp3z6x5e4jPyyXyc7fP/+Trg93nZsLz9r4CNgbcAy4E3F/9fWv+fWu2/+H1/gdUBfjNgz9bvI9Azl+frpJbIWFLQ2Skvv5RUmx+d35fLgCvbeW2F8g8j/ch0z+/V46SAVBzgPkT6LHYDPkwKWluVeF2Xkn5c3pa36c2aAa4b6TP/zfy5WAC8tbMCUK0ftW6iDgLmRsTKwgxJ/5S0UNIySQcBRMRdEfFIRLRExMPAFcA7yjzGCmBHSYMiYklETCyavznpQ7sqIiZHxCutN46IpyLitohYHhFzgPPbOPZPImJ2RMwnBc09ShUoIh4EbiX9OrZeNjkiJkbEyoiYQQpM7b3WK4BjACT1I9UsrsjLTiLVUmZGxHLSB/SDknq0s6+t8/v+CvAEqUbxRn9NRPwhIublcv2QFDQL/U2/JQXAQhPnGOD3ReX4bkQ8lv/P3wH2kLQd6X/QjxTAlNd5oY2ybZ7/trWs4FhS7fyZiFgCnA6Ma/V6z46IZRHxEPAQKdCtixXALpL6R8SCiJjSxjr7kQLt9yL
i9Yj4G/AX8v8ruy4i7s/vy2V08Lkh/RAWgtrxpFrqGiLimvxZbImIq4AnSUG0lD9HxD/yNq+12l8L8DHgi8ANwHkR8e8O9tcwah3g5gGDij+EEXFARAzIy7oBSNpX0p25Y3kRqY9uUJnH+BQwEviPpAckHZnn/57UXLpS0mxJ50nq2XpjSYMlXZk73l8hNWtbH/vFoulXSR/sjpwJfEbSm1odb6Skv0h6MR/vO20cr+By4GhJvYCjgSkR8Wxeth3wpxy0FpJqgqtIfVdtmR0RAyKiP6lGtYwUuArl+qqkxyQtyvvbtKhcfyZ94bcn1XYXRUThbOR2wAVF5ZhPqq1tk7/0PyM11V6SdHE7fZPz8t+t2ik7pFrLs0XPnyXVjIpf77r8n9ryAdKPybP5RMf+7ZTn+Rwgisu0zXqW53ekmtgxpM/iGiR9TNKDRe/3bnT8XXm+1ML8Q3snqQZ5YRllbBi1DnD3kZoK7+tgvctJvx5DImJT4BekLwmkKvgmhRVzDWKLwvOIeDIijgEGA/8PuFZSn4hYERFnR8QuwAHAkaRfqta+S2oejMpf/uOKjr3OIuI/pGbU11st+jnwH2BEPt7X2zteREwjfWkOJ3U4X160+Hng8By0Co/eETGrjLItyvt6D6ThD6Ta5n8Dm+UfoEWFcuVf/atJtaiPsrr2VijHSa3KsXFE/DNv+5OI2AvYlfRDtMYZ5uzxvJ8PlCj2bFIwLRhK6v54qaPX24bWn6k1foQi4oGIeB/pM3U96bW3VZ4hkoq/Q0NJTf/18UdSn/QzRT9mhXJuR2qGf57UJzoAeJTVn5/2bg1U8pZBko4A9gfuIJ0MbBo1DXARsZDUqXuRpA9K6iupm6Q9gD5Fq/YD5kfEa5JGk77MBU8AvSX9V66B/S+p+QSApOMkbZF/SRfm2askHSxp9xwQXyE1O1a1Ucx+5E5ZSdvQ9hdwXZ0NfIJUYyo+3ivAEkk7kzp1S7mc1Hw4iNQHV/AL4Nz8oUfSFpI6+iEhr9sXGEfqTC6UaSXpxEEPSWcCrWtahZrFe1mzZvEL4HRJu+Z9byrpQ3l6n1w770kKKoVO8DVERABfAb4h6ROS+ufPydslXZxXuwL4sqThufzfAa4q7v6owEPArpL2kNSb1LwvvDcbSTpW0qYRsYL0v2rrc/Ov/Jq+Jqmn0jCg9wBXrkN53hDpTO9Y4IQ2FvchBas5uayfINXgCl4CtpVU9ugESYOA3+TjHQ+8Jwe8plDzYSIRcR7pw/s10tmvl0j9TqcC/8yrfRb4lqTFpKbd1UXbL8rLf036dVxK6qsoOAyYKmkJ6UzguFzjeBNwLekD+hjwd9qo8pOC0J6kGstNpFpXVUTEdFJtpziYn0IK4ItJv8ZXdbCbK0id23+LiLlF8y8g1Xpvze/bRGDfEvvZWnkcHKlWOJBUI4PUlP8r6cfkWVIgWqNZExH/AFpIzeQZRfP/RKo5X5mb3I+SapyQguSvSB3Xz5Kaoj9oq3ARcS2p0/yTpNrRS8C3Sc1jgEtI7+XdpKETrwFfKPF62xURTwDfAm4n9WG1Hjv2UWBGfj3jyf2PrfbxOinYH046a3sR8LFcc18vETEpIp5uY/404IekltFLpBNz/yha5W+kH60XJc1tvX07Lib10d0cEfNIXT6/lrR5B9s1BKUfT7OOSfobcHlE/LreZTErhwOclUXSPsBtpH7SxfUuj1k5NshrUa0ykn5Las6d7OBmjcQ1ODNrWq7BmVnTam/Ue10MGtg9hg1ZayyudWHTXtii45Wsy3h98XxWLlu6XuM8Dz24T8yb39bImbVNfnj5hIg4bH2Otz66VIAbNqQn908YUu9iWAXeeu5n610Eq8CTV52/3vuYN38V908YWta63bd6stwrkmqiSwU4M+v6AmihpcP1ugIHODOrSBCsiPKaqPXmAGdmFXMNzsyaUhCsapDhZQ5wZlaxltI3KOkyHODMrCIBrHKAM7Nm5RqcmTWlAFa4D87MmlE
[base64-encoded PNG data omitted: Gaussian Naive Bayes confusion matrix plot]\n",
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {
+ "needs_background": "light"
+ },
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Gaussian Naive Bayes Model Accuracy (on testing set): \n",
+ "0.618020304568528\n"
+ ]
+ }
+ ],
+ "source": [
+ "# plot_confusion_matrix was removed in scikit-learn 1.2;\n",
+ "# ConfusionMatrixDisplay.from_estimator (scikit-learn >= 1.0) replaces it.\n",
+ "from sklearn.metrics import ConfusionMatrixDisplay\n",
+ "from sklearn.metrics import accuracy_score\n",
+ "\n",
+ "# Fit on the training split, then predict on the held-out test split\n",
+ "GNB = GaussianNB()\n",
+ "GNB_model = GNB.fit(X_train, y_train.values.ravel())\n",
+ "y_pred = GNB_model.predict(X_test)\n",
+ "\n",
+ "disp = ConfusionMatrixDisplay.from_estimator(GNB_model, X_test, y_test)\n",
+ "disp.ax_.set_title('Gaussian Naive Bayes Confusion Matrix')\n",
+ "\n",
+ "plt.show()\n",
+ "print('Gaussian Naive Bayes Model Accuracy (on testing set): ')\n",
+ "print(accuracy_score(y_test, y_pred))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 96,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "[base64-encoded PNG data omitted: Logistic Regression confusion matrix plot]\n",
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {
+ "needs_background": "light"
+ },
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Logistic Regression Model Accuracy (on testing set): \n",
+ "0.6573604060913706\n"
+ ]
+ }
+ ],
+ "source": [
+ "# A high max_iter gives the solver room to converge\n",
+ "lr = LogisticRegression(max_iter=10000)\n",
+ "lr_model = lr.fit(X_train, y_train.values.ravel())\n",
+ "y_pred = lr_model.predict(X_test)\n",
+ "\n",
+ "disp = ConfusionMatrixDisplay.from_estimator(lr_model, X_test, y_test)\n",
+ "disp.ax_.set_title('Logistic Regression Confusion Matrix')\n",
+ "\n",
+ "plt.show()\n",
+ "print('Logistic Regression Model Accuracy (on testing set): ')\n",
+ "print(accuracy_score(y_test, y_pred))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 97,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "[base64-encoded PNG data omitted: Random Forest confusion matrix plot]\n",
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {
+ "needs_background": "light"
+ },
+ "output_type": "display_data"
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Random Forest Model Accuracy (on testing set): \n",
+ "0.6307106598984772\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Fixed random_state keeps the forest reproducible across runs\n",
+ "rf = RandomForestClassifier(random_state=1)\n",
+ "rf_model = rf.fit(X_train, y_train.values.ravel())\n",
+ "y_pred = rf_model.predict(X_test)\n",
+ "\n",
+ "disp = ConfusionMatrixDisplay.from_estimator(rf_model, X_test, y_test)\n",
+ "disp.ax_.set_title('Random Forest Confusion Matrix')\n",
+ "\n",
+ "plt.show()\n",
+ "print('Random Forest Model Accuracy (on testing set): ')\n",
+ "print(accuracy_score(y_test, y_pred))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 98,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "
A68klOOBVwLU5byuA7wPb53kX9dnPE8sd+5JpQ0gFlqXAe3Iao0kl5480u2RWdXza6gTSr+lK4ALg0N4voMYA9wTwOmAUcHnJydS77KV53t6kgND7j/Ml4AZgEjCRFHBOKwlwXcC/5JN2Owaooub1PgQsLhnfk1Q9nQi8IZ+sB5CqgUeS/hmG5/E7SEFpFOkf4c0lAev6Ptu5kBTMx+T9fAA4qmT5LuBT+UTbrkw+nwT+usJ+9D3O78r/AAL+DHgReEOe91XgHGBoHt6Sl9uTVEKcUpJmb2Do+70GsEfJ+KMl39On8/c0LR+r7wCX9snnhfm4ldvXy4ALKuzrUNI/3inAMOBgUiDbsyRArCKV9oaQ/vEvK5fXfsY37StwDPAzYGT+zv8EGFtyLvcGuI/lPO1OCgw/Bi7qs8/fJZ2X+5BqPK/tZ//OB74MzAc+kactAD7AKwPcHsD/ycd4IilgnVlhvzY79mx+3vw56VyblPP7o2YHrZriU10SSb+e55OifRdwJS//qm06Ofr5x1sEnFEyfxYpoJS2I72mZP6/At/Lnx8C3lky7xDg0fx5Tk5nRMn8OQwc4EYCzwMH5fHTgSvy52+TA2jJ8veTAsaBpOC7WZsXfQJc3rf1wKySaccAi0qWXzJAPjcCcyvMf8VxLjP/p+R2J9IPxRWUBKiSf5ingXcAQ/vM6/u9Vgpw95JLfnl8cs7/kJJ87l5hX64pPUfKzH8L6Z+wo2TapcAX8+fzgXNL5r0TuK9cXvsZ37SvpMD1ipJ7yXKLeDnALQT+tmTenmX2eVrJ/JuAef3s3/mkAPdm4Pek0v9TpIC0KcCVWe9w4A8V9muzY1/uvCE1P/2RVEvacaB40EpDXXpRI+LeiPhoREwjlcSmkKqL1Xq85PNjpF/kCRXmT8mfp+TxcvMAnomIdTXkg4h4kVSd/YgkkUp0F+TZuwInSFrdO5CqpVPy38cioquKzUwglTT65n1qyfjjVLaSFCiqIulQSTdIWpXz/U5ePsZfI5U2rpb0sKSTACLiQVLp64vA05IukzRls8QHtivwk5Jjdi+pyr5TyTKV9negfZ0CPB4RPSXT+h7PJ0s+v0gqVW2Ji4CrgMskLZP0r5KG9pOnvt/vEF65zzXlKSKuJ5XMPgf8PCJeKp0vaVL+jp6Q9DypKj2hTFJ9DXSuzSf9X/9XRKysIr2WUffLRCLiPtIvzuvypLWkUlGvncusNr3k8y6kX7oVFeYvy5+Xkf55ys2D9EtEhfH+XAD8Bam4P4bUPgXpRDg9IrYvGUZGxKV53i79NNT23e4K0j72zfsTNeT1N8D/q2ZnJA0nVf2/TipZbw/8klQNJSJeiIgTImJ34D3AP0h6e553SUS8Oec1SFX+Wj0OHNrnuI2IiGr39zfAIZJG9TN/GTBdUun53Pd41qLfczYiNkbEqRExi9TR8W7gI/3kqe/320UqeW2Ni4ETSNXKvr5KOo6vj4ixwIfJ33Fv9vtJs99jny8X+U7e3ick7bElmW6WevSivkbSCZKm5fHppLaBG/IitwNvzdfXjCNditHXhyXNkjSSVF36UUR0l8z/Z0kjJe1Faiz/QZ5+KfA5SRMlTQA+TzoB+vMUsGPORyW/IzUgzye11WzI078LHCvpACWjJL1L0hhSFWM5cEaePkLSm0q2Oy33BpL3bQFwuqQxknYlXd5RKe99fYHU4/s1STsDSNpD0sWStu+z7DBSu8wzQJekQ0ltK+T13p3XFal63g10S9pT0sE5QK4jNVB3U7tz8r7umrc3UdJhNax/ESlIXp7Ptw5JO+Zr8d5J6jRYC5woaWi+/u49pLa7LXE7MC+nNRt4f+8MSW+TtHf+x3+e9ENV7phcCvy9pBmSRgNfAX5QZQm/kv8k/fBeV2beGHJHlqSpwD/1mf8UqU2wFqfkvx8j/UBe2E7XyNWjBPcCqdH9RklrSYHtLtKvDBFxDSkg3QncysuloVIXkUp9T5Ia54/vM/+3pCrUQuD
rEXF1nv5l4Jac9h+B2/K0snLp8lLg4VxdKlvditTwcCHpF/jCkum3AH9D6h19Nufpo3leN+mfag9Sj9RS4C/zqteSevaelNRbMv0U6Z/yYVI7yiXAef3lvUweHyK1++0G3C3pOVIp7RbSd1K67AukY7og5/uDpHbSXjNJpaQ1pDaesyNiESkonkEqcfY2NJ9C7b6Rt3e1pBdI58gB1a4cEetJ7YD3kdrjnif9oEwAbsw/QO8ldXKtAM4m9fTdtwV5BfhnUofMs6RLni4pmbcz8KOch3tJ52a5H6bzSOf1dcAjpB+IT21hfjaJiFURsTCfo32dSuoIe47UY/zjPvO/SioQrJb0jwNtS9KfkH54P5LP738hlfZO2pp9GEwqf5wGMQPSIlID7rll5u1GOjmG1uGXz8y2Mb5Vy8wKywHOzAqr6VVUM7NGcQnOzAqrpW6unTC+M3abXu6aSWtVD9w5cuCFrGWsYy0bYr0GXrJ/h7xtVKxcVd3VQrfeuf6qiJi7NdvbGi0V4HabPpSbrpo+8ILWMg6Zsm+zs2A1uDEWbnUaK1d1c9NVu1S1bOfkxdXcSdEwLRXgzKz1BdBDz4DLtQIHODOrSRBsjC25oWXwOcCZWc1cgjOzQgqC7ja5vMwBzsxq1lP1g3maywHOzGoSQLcDnJkVlUtwZlZIAWx0G5yZFVEQrqKaWUEFdLdHfHOAM7PapDsZ2oMDnJnVSHSzVffrDxoHODOrSepkcIAzswJK18E5wJlZQfW4BGdmReQSnJkVViC62+RtBw5wZlYzV1HNrJACsSE6m52NqjjAmVlN0oW+rqKaWUG5k8HMCilCdIdLcGZWUD0uwZlZEaVOhvYIHe2RSzNrGe5kMLNC6/Z1cGZWRL6TwcwKrce9qGZWROlmewc4MyugQGz0rVpmVkQRtM2Fvu2RSzNrIaKnyqGq1KROSX+Q9PM8Pl7SNZIW5787lCx7sqQHJd0v6ZCB0naAM7OaBKkEV81Qpb8D7i0ZPwlYGBEzgYV5HEmzgHnAXsBc4GxJFevKDnBmVrNuOqoaBiJpGvAu4NySyYcBF+TPFwCHl0y/LCLWR8QjwIPA/pXSdxucmdUkUC0PvJwg6ZaS8fkRMb9k/EzgRGBMybSdImI5QEQslzQpT58K3FCy3NI8rV8OcGZWk/TawKpDx4qImF1uhqR3A09HxK2S5lSRVrmoGpVWcIAzsxrV7cXPbwLeK+mdwAhgrKSLgackTc6lt8nA03n5pcD0kvWnAcsqbcBtcGZWkyDdyVDNUDGdiJMjYlpE7EbqPLg2Ij4MXAkcmRc7Ergif74SmCdpuKQZwEzgpkrbcAnOzGrW4Cf6ngEskHQUsAQ4AiAi7pa0ALgH6AKOi4juSgk5wJlZTSJU93tRI2IRsCh/Xgm8vZ/lTgdOrzZdBzgzq0nqZPCtWmZWSH4ng5kVVOpk8AMvzayg/LgkMyukGu9kaCoHODOrmV86Y2aFFAEbexzgzKyAUhXVAc7MCqrBdzLUjQNcnXR3w6fmvpodJ2/ktAsf4fRjdmXpQyMAWPt8J6PGdvPt39zPxg3iGydOY/GdI1EHfOJLT7DPQWuanHvrNXvO8xx72jI6O4JfXTqeBWft1OwstRxfJpJJmgt8A+gEzo2IMxq5vWb66bkTmT5zPS+uSUX3z37nsU3zvnPqFEaNSbfM/er7O6Zp197P6hVD+OyHduebv3qAjvYo8RdaR0dw3Fee4OR5u7Ni+VC++cvF3HDVOJYsHtHsrLWY9qmiNiyX+VHC3wIOBWYBH8iPHC6cZ5YN5aaFYzn0gys3mxcB1125PW87/FkAljwwnP3ekkps20/oYvS4bh64Y+Sg5tfK23O/F1n26DCeXDKcro0dLLpiew485LlmZ6sl1fOdDI3UyDC8P/BgRDwcERuAy0iPHC6cc74wlaM/twyVOZp33TiKHSZ2MXX3DQDsvtc6fn/VOLq74Mklw1h850ieWTZ0kHNs5ey480aeWTZs0/i
K5UOZMHljE3PUmlIvamdVQ7M1soo6FXi8ZHwpcEDfhSR9HPg4wC5T269J8IZrxrL9hC5mvv4l7vjf0ZvN/++f7sCcXHoDOGTeSpYsHs4n5+7JpGkbmDV7LZ2dFR9KaoNEZQoc4a9mM77QN6nq8cL5+ezzAWbvM6LtTqd7bh7FDVeP5eaFs9iwXrz4Qif/8sld+MxZS+jugv/55TjO+vUDm5bvHALHnvryQ0g//Z6ZTN19fTOybn2sWD6UiVM2bBqfMHkjK5906bqcVqh+VqORAa7mxwu3o4+dspyPnbIcgDv+dzQ/OmcinzlrCQC3/W4M0/dYz8QpL1dz1r0oQIwY2cOtvx1N55Bg11c7wLWC+28fydQZG9hp+npWPjmUOYet5ozjdm12tlqOe1GTm4GZ+dHCT5AeSfzBBm6v5fz2ildWTwFWrxzKZz+wO+pIbT4nfvOxfta2wdbTLb712al85ZKH6eiEqy8bz2MPuAe1nHbpRW1YgIuILkmfBK4iXSZyXkTc3ajttYJ9Dlrzimva/vHMJZsts/P0DXzv+vsGM1tWg5uvHcvN145tdjZaWoTo2tYDHEBE/BL4ZSO3YWaDz1VUMyskt8GZWaE5wJlZIfk6ODMrNF8HZ2aFFAFdfuClmRWVq6hmVkhugzOzQgsHODMrKncymFkhRbgNzswKS3S7F9XMisptcGZWSL4X1cyKK9rnUe4OcGZWM/eimlkhhTsZzKzI2qWK2h5h2MxaSoSqGiqRNELSTZLukHS3pFPz9PGSrpG0OP/doWSdkyU9KOl+SYcMlE8HODOrSUR9AhywHjg4IvYB9gXmSvpT4CRgYUTMBBbmcSTNIr28ai9gLnC2pIpvl3aAM7Oa9YSqGiqJpPctTUPzEMBhwAV5+gXA4fnzYcBlEbE+Ih4BHgT2r7QNBzgzq1lEdcNAJHVKuh14GrgmIm4EdoqI5Wk7sRyYlBefCjxesvrSPK1f7mQws5oEoqf6XtQJkm4pGZ8fEfM3pRXRDewraXvgJ5JeVyGtckXCimHUAc7MalZDJ+qKiJg9YHoRqyUtIrWtPSVpckQslzSZVLqDVGKbXrLaNGBZpXRdRTWz2tSpk0HSxFxyQ9J2wDuA+4ArgSPzYkcCV+TPVwLzJA2XNAOYCdxUaRsuwZlZ7epzHdxk4ILcE9oBLIiIn0v6PbBA0lHAEuAIgIi4W9IC4B6gCzguV3H75QBnZjWrx9NEIuJOYL8y01cCb+9nndOB06vdRr8BTtI3qRCnI+L4ajdiZsURQE9P+9+LekuFeWa2rQqg3R+XFBEXlI5LGhURaxufJTNrdYW5F1XSgZLuAe7N4/tIOrvhOTOz1hVVDk1WzWUiZwKHACsBIuIO4K0NzJOZtbTqLhFphceaV9WLGhGPS6/IbMWuWTMruBYonVWjmgD3uKSDgJA0DDieXF01s21QQLRJL2o1VdRjgeNIN7U+QXqsyXENzJOZtTxVOTTXgCW4iFgBfGgQ8mJm7aJNqqjV9KLuLulnkp6R9LSkKyTtPhiZM7MWVaBe1EuABaT7xqYAPwQubWSmzKyF9V7oW83QZNUEOEXERRHRlYeLaYnYbGbNUq8HXjZapXtRx+eP/y3pJOAyUmD7S+AXg5A3M2tVbdKLWqmT4VZSQOvdk2NK5gVwWqMyZWatTS1QOqtGpXtRZwxmRsysTbRIB0I1qrqTIT8nfRYwondaRFzYqEyZWStrjQ6EagwY4CR9AZhDCnC/BA4Frgcc4My2VW1SgqumF/X9pKdrPhkRfw3sAwxvaK7MrLX1VDk0WTVV1JciokdSl6SxpDfc+EJfs21VER54WeKW/Oab75J6VtcwwJtszKzY2r4XtVdE/G3+eI6kXwNj88sizGxb1e4BTtIbKs2LiNsakyUzs/qoVIL7twrzAji4znnhnhfHs9/N8+qdrDXQJO5rdhasCdq+ihoRbxvMjJhZmwgKcauWmVl57V6CMzPrT9tXUc3M+tUmAa6aJ/pK0oc
lfT6P7yJp/8ZnzcxaVoGe6Hs2cCDwgTz+AvCthuXIzFqaovqh2aqpoh4QEW+Q9AeAiHg2vz7QzLZVBepF3Sipk1zglDSRlriN1syapRVKZ9Wopor6n8BPgEmSTic9KukrDc2VmbW2NmmDq+Ze1O9LupX0yCQBh0eE32xvtq1qkfa1alTzwMtdgBeBn5VOi4gljcyYmbWwogQ40hu0el8+MwKYAdwP7NXAfJlZC1ObtMJXU0Xdu3Q8P2XkmH4WNzNrGTXfyRARt0l6YyMyY2ZtoihVVEn/UDLaAbwBeKZhOTKz1lanTgZJ00kvr9qZdOnZ/Ij4Rn7p/A+A3YBHgb+IiGfzOicDRwHdwPERcVWlbVRzmciYkmE4qU3usC3YHzMrivpcJtIFnBARrwX+FDhO0izgJGBhRMwEFuZx8rx5pPb/ucDZ+RrdflUsweWVR0fEPw2YVTPbdtShBBcRy4Hl+fMLku4FppIKUHPyYhcAi4DP5OmXRcR64BFJDwL7A7/vbxuVHlk+JCK6Kj263My2PaKmXtQJkm4pGZ8fEfM3S1PaDdgPuBHYKQc/ImK5pEl5sanADSWrLc3T+lWpBHcTqb3tdklXAj8E1vbOjIgfV0rYzAqqtja4FRExu9ICkkYDlwOfjojnpX7vcy03o2JOqulFHQ+sJL2Dofd6uAAc4My2VXXqRZU0lBTcvl9SaHpK0uRceptMehczpBLb9JLVpwHLKqVfKcBNyj2od/FyYOvVJp3EZtYQ9elFFfA94N6I+PeSWVcCRwJn5L9XlEy/RNK/A1OAmQzwjuZKAa4TGM0WFAvNrNjqdC/qm4C/Av4o6fY87RRSYFsg6ShgCXAEQETcLWkBcA+pB/a4iOiutIFKAW55RHxp6/JvZoVUn17U6ylfgIL0cI9y65wOnF7tNioFuPZ4op2ZDa4oxr2oZSOomVm7NFJVevHzqsHMiJm1j8I8D87MbDMOcGZWSC3yOPJqOMCZWU2Eq6hmVmAOcGZWXA5wZlZYDnBmVkhFem2gmdlmHODMrKiKcKuWmVlZrqKaWTH5Ql8zKzQHODMrIt/JYGaFpp72iHAOcGZWG7fBmVmRuYpqZsXlAGdmReUSnJkVlwOcmRVSQd6qZWa2GV8HZ2bFFu0R4RzgzKxmLsFtKzb0sMMpS2BjoO5g/UFjWPvBieiFbsZ97Qk6nt5Iz6ShPHfiVGJ0J2wMxpz9JEMfWkcI1hw9iY17j2r2Xlg2e87zHHvaMjo7gl9dOp4FZ+3U7Cy1nja60LejUQlLOk/S05LuatQ2WsJQsfq0XXj2GzNYdeYMht22liH3v8TIy1ey4fWjWHXOq9jw+lGMvHwlANtdvRqAVf85g9WnTmf0fz0NbXLbS9F1dATHfeUJPvehGfzNnD1522Gr2WXmumZnqyWpp7qh2RoW4IDzgbkNTL81SMR2+TB2RxqA4TeuYd3B4wBYd/A4ht+wBoDOx9ezYZ+RAMT2Q4hRnQx50P9ErWDP/V5k2aPDeHLJcLo2drDoiu058JDnmp2tlrTNB7iIuA5Y1aj0W0p3sMOnH2HCRxazYd9RdO25HR3PddEzPrUA9IwfQsdzXQB0zRjB8BvXQHfQ8dQGhjy0js4VG5uZe8t23Hkjzywbtml8xfKhTJjs72YzQepkqGZosqa3wUn6OPBxgKETxzY5N1uoUzx75gy0pptxX32CzsfW97vouneMY8jj69nhhEfpmTiUja/ZjujUIGbW+qMyX0ML/I+2JHcyVCki5gPzAUbOnNImh628GN3Jhr1HMuy2NfSMG0LHqlSK61jVRc+4fKg7xZqjX2643uHEx+iePKyfFG0wrVg+lIlTNmwanzB5IyufHNrEHLWwNvlPbWQb3DZBz3WhNd1pZH0Pw+5YS/e04azffzQjrk3tNyOufY71B4zetAzrUuPE0NvXEp3QvcvwZmTd+rj/9pFMnbGBnaavZ8jQHuYctpobrh7X7Gy1nN4LfasZmq3pJbh21/FsF2P
PXJ4aVCNY96axbHjjaDbuuR3jvvYEI36zmp6J6TIRgI7VXWz/xaXQkdrmnv/7Kc3dAdukp1t867NT+colD9PRCVdfNp7HHhjR7Gy1ngg/8FLSpcAcYIKkpcAXIuJ7jdpes3TvNoJnz5yx2fQY28nq03bZbHrPTsNY9e3dByNrtgVuvnYsN1/bpm3Bg6k94lvjAlxEfKBRaZtZc7VC9bMaboMzs9oE6eL0aoYBlLshQNJ4SddIWpz/7lAy72RJD0q6X9IhA6XvAGdmtYsqh4Gdz+Y3BJwELIyImcDCPI6kWcA8YK+8ztmSOisl7gBnZjWrVy9qPzcEHAZckD9fABxeMv2yiFgfEY8ADwL7V0rfvahmVrMaelEnSLqlZHx+vva1kp0iYjlARCyXNClPnwrcULLc0jytXw5wZlab2p4msiIiZtdpy+Vu+amYE1dRzawm6ULfqGrYQk9JmgyQ/z6dpy8FppcsNw1YVikhBzgzq11PlcOWuRI4Mn8+EriiZPo8ScMlzQBmAjdVSshVVDOr2VaUzl6ZTpkbAoAzgAWSjgKWAEcARMTdkhYA9wBdwHER0V0pfQc4M6tNHZ/oW+GGgLf3s/zpwOnVpu8AZ2Y18r2oZlZkbfKgPAc4M6uNX/xsZoXmEpyZFVZ7xDcHODOrnXrao47qAGdmtQm25iLeQeUAZ2Y1EVt1G9agcoAzs9o5wJlZYTnAmVkhuQ3OzIrMvahmVlDhKqqZFVTgAGdmBdYeNVQHODOrna+DM7PicoAzs0KKgO72qKM6wJlZ7VyCM7PCcoAzs0IKwO9kMLNiCgi3wZlZEQXuZDCzAnMbnJkVlgOcmRWTb7Y3s6IKwI9LMrPCcgnOzIrJt2qZWVEFhK+DM7PC8p0MZlZYboMzs0KKcC+qmRWYS3BmVkxBdHc3OxNVcYAzs9r4cUlmVmhtcplIR7MzYGbtJYDoiaqGgUiaK+l+SQ9KOqneeXWAM7PaRH7gZTVDBZI6gW8BhwKzgA9ImlXPrLqKamY1q1Mnw/7AgxHxMICky4DDgHvqkTiAooW6eyU9AzzW7Hw0wARgRbMzYTUp6ne2a0RM3JoEJP2adHyqMQJYVzI+PyLm53TeD8yNiKPz+F8BB0TEJ7cmf6VaqgS3tQe+VUm6JSJmNzsfVj1/Z/2LiLl1Skrlkq9T2oDb4MyseZYC00vGpwHL6rkBBzgza5abgZmSZkgaBswDrqznBlqqilpg85udAauZv7MGi4guSZ8ErgI6gfMi4u56bqOlOhnMzOrJVVQzKywHODMrLAe4Bmr0bShWf5LOk/S0pLuanRfbeg5wDTIYt6FYQ5wP1Os6L2syB7jG2XQbSkRsAHpvQ7EWFhHXAauanQ+rDwe4xpkKPF4yvjRPM7NB4gDXOA2/DcXMKnOAa5yG34ZiZpU5wDVOw29DMbPKHOAaJCK6gN7bUO4FFtT7NhSrP0mXAr8H9pS0VNJRzc6TbTnfqmVmheUSnJkVlgOcmRWWA5yZFZYDnJkVlgOcmRWWA1wbkdQt6XZJd0n6oaSRW5HW+fmtRkg6t9KDACTNkXTQFmzjUUmbvX2pv+l9lllT47a+KOkfa82jFZsDXHt5KSL2jYjXARuAY0tn5ieY1Cwijo6ISu+inAPUHODMms0Brn39Dtgjl67+W9IlwB8ldUr6mqSbJd0p6RgAJWdJukfSL4BJvQlJWiRpdv48V9Jtku6QtFDSbqRA+ve59PgWSRMlXZ63cbOkN+V1d5R0taQ/SPoO5e/HfQVJP5V0q6S7JX28z7x/y3lZKGlinvYqSb/O6/xO0mvqcjStkPzSmTYkaQjpOXO/zpP2B14XEY/kIPFcRLxR0nDgfyRdDewH7AnsDexEenv4eX3SnQh8F3hrTmt8RKySdA6wJiK+npe7BPiPiLhe0i6kuzVeC3wBuD4iviTpXcArAlY/Ppa3sR1ws6TLI2IlMAq4LSJOkPT5nPYnSS+DOTYiFks6ADgbOHgLDqNtAxzg2st
2km7Pn38HfI9UdbwpIh7J0/8ceH1v+xowDpgJvBW4NCK6gWWSri2T/p8C1/WmFRH9PRftHcAsaVMBbaykMXkb/zev+wtJz1axT8dLel/+PD3ndSXQA/wgT78Y+LGk0Xl/f1iy7eFVbMO2UQ5w7eWliNi3dEL+R19bOgn4VERc1We5dzLw45pUxTKQmjYOjIiXyuSl6nv/JM0hBcsDI+JFSYuAEf0sHnm7q/seA7P+uA2ueK4CPiFpKICkV0saBVwHzMttdJOBt5VZ9/fAn0makdcdn6e/AIwpWe5qUnWRvNy++eN1wIfytEOBHQbI6zjg2RzcXkMqQfbqAHpLoR8kVX2fBx6RdETehiTtM8A2bBvmAFc855La127LL075Dqmk/hNgMfBH4NvAb/uuGBHPkNrNfizpDl6uIv4MeF9vJwNwPDA7d2Lcw8u9uacCb5V0G6mqvGSAvP4aGCLpTuA04IaSeWuBvSTdSmpj+1Ke/iHgqJy/u/Fj4K0CP03EzArLJTgzKywHODMrLAc4MyssBzgzKywHODMrLAc4MyssBzgzK6z/Dxtehp8ENcR2AAAAAElFTkSuQmCC\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "SVC Model Accuracy (on testing set): \n", + "0.6078680203045685\n" + ] + } + ], + "source": [ + "svc = SVC(probability = True)\n", + "svc_model = svc.fit(X_train, y_train.values.ravel())\n", + "y_pred = svc_model.predict(X_test)\n", + "disp = plot_confusion_matrix(svc_model, X_test, y_test)\n", + "disp.ax_.set_title('Support Vector Classifier Confusion Matrix')\n", + "\n", + "plt.show()\n", + "\n", + "print('SVC Model Accuracy (on testing set): ')\n", + "print(accuracy_score(y_test, y_pred))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Prediction Time" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# UFC 259: Blachowicz vs Adesanya\n", + "\n", + "### Title fights (5 rounds):\n", + "Jan Blachowicz vs Israel Adesanya\n", + "\n", + "Amanda Nunes vs Megan Anderson\n", + "\n", + "Petr Yan vs Aljamain Sterling\n", + "\n", + "### 3 round fights:\n", + "Islam Makhachev vs Drew Dober\n", + "\n", + "Thiago Santos vs Aleksandar Rakic\n", + "\n", + "\n", + "## The Stats: ([https://www.espn.co.uk/mma/fightcenter/_/id/600001860/league/ufc](http://))\n", + "### Fight 1\n", + "#### Jan Blachowicz (Blue):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 4\n", + "* Draws: 0\n", + "* Losses: 8\n", + "* Wins: 27\n", + "* Stance: Orthodox\n", + "* Height: 188\n", + "* Reach: 198\n", + "* Weight: 205\n", + "\n", + "#### Israel Adesanya (Red):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 20\n", + "* Draws: 0\n", + "* Losses: 0\n", + "* Wins: 20\n", + "* Stance: Switch\n", + "* Height: 193\n", + "* Reach: 203\n", + "* Weight: 193 (speculation based on interview, weigh-ins to come)\n", + "\n", + "\n", + "### Fight 2\n", + "#### Amanda Nunes (Blue):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 11\n", + "* Draws: 0\n", + "* 
Losses: 4\n", + "* Wins: 20\n", + "* Stance: Orthodox\n", + "* Height: 173\n", + "* Reach: 165\n", + "* Weight: 145\n", + "\n", + "#### Megan Anderson (Red):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 2\n", + "* Draws: 0\n", + "* Losses: 4\n", + "* Wins: 11\n", + "* Stance: Orthodox\n", + "* Height: 183\n", + "* Reach: 183\n", + "* Weight: 145\n", + "\n", + "### Fight 3\n", + "#### Petr Yan (Blue):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 10\n", + "* Draws: 0\n", + "* Losses: 1\n", + "* Wins: 15\n", + "* Stance: Switch\n", + "* Height: 170\n", + "* Reach: 170\n", + "* Weight: 135\n", + "\n", + "#### Aljamain Sterling (Red):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 5\n", + "* Draws: 0\n", + "* Losses: 3\n", + "* Wins: 19\n", + "* Stance: Orthodox\n", + "* Height: 170\n", + "* Reach: 180\n", + "* Weight: 135\n", + "\n", + "### Fight 4\n", + "#### Islam Makhachev (Blue):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 6\n", + "* Draws: 0\n", + "* Losses: 1\n", + "* Wins: 18\n", + "* Stance: Orthodox\n", + "* Height: 178\n", + "* Reach: 178\n", + "* Weight: 155\n", + "\n", + "#### Drew Dober (Red):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 3\n", + "* Draws: 0\n", + "* Losses: 9\n", + "* Wins: 23\n", + "* Stance: Southpaw\n", + "* Height: 173\n", + "* Reach: 178\n", + "* Weight: 155\n", + "\n", + "### Fight 5\n", + "#### Thiago Santos (Blue):\n", + "* Current Lose Streak: 2\n", + "* Current Win Streak: 0\n", + "* Draws: 0\n", + "* Losses: 8\n", + "* Wins: 21\n", + "* Stance: Orthodox\n", + "* Height: 188\n", + "* Reach: 193\n", + "* Weight: 205\n", + "\n", + "#### Aleksandar Rakic (Red):\n", + "* Current Lose Streak: 0\n", + "* Current Win Streak: 1\n", + "* Draws: 0\n", + "* Losses: 2\n", + "* Wins: 13\n", + "* Stance: Orthodox\n", + "* Height: 193\n", + "* Reach: 198\n", + "* Weight: 205" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With all the 
stats available to us, we can create a data frame to feed into our models and get predictions." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "columns = X.columns" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fight1 = [5, 0, 4, 0, 8, 27, 0, 188, 198, 205, 0, 20, 0, 0, 20, 1, 193, 203, 193]\n", + "fight2 = [5, 0, 11, 0, 4, 20, 0, 173, 165, 145, 0, 2, 0, 4, 11, 0, 183, 183, 145]\n", + "fight3 = [5, 0, 10, 0, 1, 15, 1, 170, 170, 135, 0, 5, 0, 3, 19, 0, 170, 180, 135]\n", + "fight4 = [3, 0, 6, 0, 1, 18, 0, 178, 178, 155, 0, 3, 0, 9, 23, 2, 173, 178, 155]\n", + "fight5 = [3, 2, 0, 0, 8, 21, 0, 188, 193, 205, 0, 1, 0, 2, 13, 0, 193, 198, 205]\n", + "\n", + "df1 = pd.DataFrame(np.array([fight1, fight2, fight3, fight4, fight5]), columns = columns)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Model Predictions: (0 indicates Red Fighter Wins, 1 indicates Blue Fighter Wins)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Support Vector Classifier:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "svc_model.predict(df1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Predicted Winners:\n", + "* Fight 1 - Israel Adesanya\n", + "* Fight 2 - Megan Anderson\n", + "* Fight 3 - Aljamain Sterling\n", + "* Fight 4 - Drew Dober\n", + "* Fight 5 - Aleksandar Rakic" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Random Forest:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "rf_model.predict(df1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Predicted Winners:\n", + "* Fight 1 - Israel Adesanya\n", + "* Fight 2 - Megan Anderson\n", + "* Fight 3 - Aljamain
Sterling\n", + "* Fight 4 - Islam Makhachev\n", + "* Fight 5 - Aleksandar Rakic" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Logistic Regression:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "lr_model.predict(df1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Predicted Winners:\n", + "* Fight 1 - Israel Adesanya\n", + "* Fight 2 - Megan Anderson\n", + "* Fight 3 - Aljamain Sterling\n", + "* Fight 4 - Drew Dober\n", + "* Fight 5 - Aleksandar Rakic" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Full Feature Modeling\n", + "We'll reduce this to a binary classification problem by dropping draws, which are rare in the UFC." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df['Winner'].value_counts()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df = df[df['Winner'] != 'Draw']" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df = df.drop(columns=['R_fighter', 'B_fighter', 'Referee', 'date', 'location', 'title_bout', 'weight_class', 'B_draw', 'R_draw'])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "mapping = {'Orthodox': 0, 'Switch': 1, 'Southpaw': 2, 'Open Stance': 3}\n", + "df['B_Stance'] = df['B_Stance'].replace(mapping)\n", + "df['R_Stance'] = df['R_Stance'].replace(mapping)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "X = df.drop(columns=['Winner'])\n", + "Y = df['Winner']\n", + "mapping = {'Red': 0, 'Blue': 1}\n", + "Y = Y.replace(mapping)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, +
"outputs": [], + "source": [ + "X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=42)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "seed = 404\n", + "np.random.seed(seed)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "gnb = GaussianNB()\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(gnb, X_train, y_train.values.ravel(), cv=kfold)\n", + "print('Gaussian Naive Bayes K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('Gaussian Naive Bayes Average Score:')\n", + "print(cv_score.mean())\n", + "print()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "lr = LogisticRegression(max_iter = 10000)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(lr, X_train, y_train.values.ravel(), cv=kfold)\n", + "print('Logistic Regression K-fold Scores (training):')\n", + "print(cv_score)\n", + "print()\n", + "print('Logistic Regression Average Score:')\n", + "print(cv_score.mean())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dt = tree.DecisionTreeClassifier(random_state = 1)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(dt, X_train, y_train.values.ravel(), cv=kfold)\n", + "print('Decision Tree K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('Decision Tree Average Score:')\n", + "print(cv_score.mean())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "knn = KNeighborsClassifier()\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(knn, 
X_train, y_train.values.ravel(), cv=kfold)\n", + "print('KNN K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('KNN Average Score:')\n", + "print(cv_score.mean())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "rf = RandomForestClassifier(random_state = 1)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(rf, X_train, y_train.values.ravel(), cv=kfold)\n", + "print('Random Forest K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('Random Forest Average Score:')\n", + "print(cv_score.mean())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "svc = SVC(probability = True)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(svc, X_train, y_train.values.ravel(), cv=kfold)\n", + "print('Support Vector Classification K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('Support Vector Classification Average Score:')\n", + "print(cv_score.mean())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "xgb = XGBClassifier(objective='binary:logistic',random_state =1, use_label_encoder=False)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(xgb, X_train, y_train.values.ravel(), cv=kfold)\n", + "print('XGBoost Classifier K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('XGBoost Classifier Average Score:')\n", + "print(cv_score.mean())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from keras.wrappers.scikit_learn import KerasClassifier\n", + "from keras.utils import np_utils\n", + "\n", + "encoder = LabelEncoder()\n", + "encoder.fit(y_train)\n", + "encoded_Y =
encoder.transform(y_train)\n", + "y_Train = np_utils.to_categorical(encoded_Y)\n", + "\n", + "encoder = LabelEncoder()\n", + "encoder.fit(y_test)\n", + "y_Test = encoder.transform(y_test)\n", + "\n", + "\n", + "def create_model():\n", + " model = Sequential()\n", + " \n", + " model.add(Dense(X_train.shape[1], input_dim=X_train.shape[1], activation='relu'))\n", + " model.add(Dense(X_train.shape[1]*2, activation='tanh'))\n", + " model.add(Dense(X_train.shape[1]*4, activation='tanh'))\n", + " model.add(Dense(X_train.shape[1]*2, activation='tanh')) \n", + " model.add(Dense(X_train.shape[1], activation='relu'))\n", + " model.add(Dense(2, activation='softmax'))\n", + "\n", + " model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "seed = 7\n", + "np.random.seed(seed)\n", + "\n", + "model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)\n", + "\n", + "\n", + "kfold = KFold(n_splits=10, shuffle=True)\n", + "results = cross_val_score(model, X_train, y_Train, cv=kfold)\n", + "print('Neural Network K-fold Scores:')\n", + "print(results)\n", + "print()\n", + "print('Neural Network Average Score:')\n", + "print(results.mean())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "lr = LogisticRegression(max_iter = 2000)\n", + "lr_model = lr.fit(X_train, y_train.values.ravel())\n", + "y_pred = lr_model.predict(X_test)\n", + "\n", + "disp = plot_confusion_matrix(lr_model, X_test, y_test)\n", + "disp.ax_.set_title('Logistic Regression Confusion Matrix')\n", + "\n", + "plt.show()\n", + "print('Logistic Regression Model Accuracy (on testing set): ')\n", + "print(accuracy_score(y_test, y_pred))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "rf =
RandomForestClassifier(random_state = 1)\n", + "rf_model = rf.fit(X_train, y_train.values.ravel())\n", + "y_pred = rf_model.predict(X_test)\n", + "disp = plot_confusion_matrix(rf_model, X_test, y_test)\n", + "disp.ax_.set_title('Random Forest Confusion Matrix')\n", + "\n", + "plt.show()\n", + "\n", + "print('Random Forest Model Accuracy (on testing set): ')\n", + "print(accuracy_score(y_test, y_pred))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "svc = SVC(probability = True)\n", + "svc_model = svc.fit(X_train, y_train.values.ravel())\n", + "y_pred = svc_model.predict(X_test)\n", + "disp = plot_confusion_matrix(svc_model, X_test, y_test)\n", + "disp.ax_.set_title('Support Vector Classifier Confusion Matrix')\n", + "\n", + "plt.show()\n", + "\n", + "print('SVC Model Accuracy (on testing set): ')\n", + "print(accuracy_score(y_test, y_pred))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "xgb = XGBClassifier(objective='binary:logistic',random_state =1, use_label_encoder=False)\n", + "kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\n", + "cv_score = cross_val_score(xgb, X_train, y_train.values.ravel(), cv=kfold)\n", + "print('XGBoost Classifier K-fold Scores:')\n", + "print(cv_score)\n", + "print()\n", + "print('XGBoost Classifier Average Score:')\n", + "print(cv_score.mean())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": { + "jp-MarkdownHeadingCollapsed": true + }, + "source": [ + "# Conclusion\n", + "\n", + "If we look at the current under/over odds in the betting world, most agree with the model predictions for Fight 1 and Fight 5, but only the Random Forest is in line with the odds for Fight 4. 
Fight 3 has the odds at -110 to -110, so Vegas seems to be split evenly on this fight.\n", + "\n", + "I hope to expand the models to take in more features; as a fan, I can't help but think that strike %, takedown %, takedown defence %, and a number of the other features come into play when assessing the likely winner.\n", + "\n", + "I hope you enjoyed this!" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.7" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} From 14159da7083687d43a21dad9175e10b6f034aea0 Mon Sep 17 00:00:00 2001 From: VARUNSHIYAM <138989960+Varunshiyam@users.noreply.github.com> Date: Tue, 5 Nov 2024 19:44:00 +0530 Subject: [PATCH 2/2] Create Readme.md --- .../MMA_Fight_prediction/Readme.md | 45 +++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100644 Prediction Models/MMA_Fight_prediction/Readme.md diff --git a/Prediction Models/MMA_Fight_prediction/Readme.md b/Prediction Models/MMA_Fight_prediction/Readme.md new file mode 100644 index 00000000..4ef4300b --- /dev/null +++ b/Prediction Models/MMA_Fight_prediction/Readme.md @@ -0,0 +1,45 @@ +# MMA Fight Prediction Model + +This repository contains a machine learning model designed to predict the outcome of MMA fights. Using historical data and various fighter statistics, the model aims to determine the probability of each fighter winning a given matchup.
+ +## Table of Contents +- [Introduction](#introduction) +- [Problem Statement](#problem-statement) +- [Model Overview](#model-overview) +- [Data](#data) +- [Installation](#installation) +- [Usage](#usage) +- [Results](#results) +- [Contributing](#contributing) +- [License](#license) + +## Introduction + +Predicting the outcome of MMA fights is challenging due to the high variability of the sport. Factors such as a fighter's style, reach, weight, previous fight record, and current form all influence the fight outcome. This project explores using machine learning techniques to analyze fight data and predict the probability of each fighter winning a matchup. + +## Problem Statement + +MMA fight outcomes are influenced by numerous factors, many of which are dynamic and hard to quantify. This project aims to address: +- **Outcome Variability**: Accurately predicting outcomes amid unpredictable events and variations. +- **Data Limitations**: Working with potentially sparse or incomplete historical data. +- **Feature Engineering**: Identifying significant features to improve prediction accuracy. +- **Generalization**: Ensuring the model works well across various fighters and events. + +## Model Overview + +The model uses historical fight data and fighter statistics to predict fight outcomes. It leverages a mix of machine learning techniques, such as: +- Logistic Regression +- Decision Trees +- Ensemble Methods + +Key features include fighter-specific data like win-loss records, average fight time, strike accuracy, and takedown success rate. By training on historical fight outcomes, the model aims to generalize to future fight predictions. + +## Data + +The model is built using historical MMA data, which includes: +- Fighter statistics: strikes, takedowns, reach, weight, etc. +- Fight records: win/loss record, recent performance, and historical matchup outcomes. +- Event details: location, fight date, and weight class. 
+ +**Note**: Data files should be placed in the `data/` directory in CSV format. Example data files are provided in the repository. +
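
As a rough illustration of how a CSV from `data/` might be read and encoded before training, here is a minimal sketch using only the standard library. It assumes the file has `Winner`, `B_Stance`, and `R_Stance` columns like the notebook's data frame, and it reuses the notebook's stance and winner mappings. The helper name `load_fights` and the file name in the usage note are hypothetical and not part of this repository.

```python
import csv
from pathlib import Path

# Encodings copied from the notebook in this patch.
STANCE_CODES = {"Orthodox": 0, "Switch": 1, "Southpaw": 2, "Open Stance": 3}
WINNER_CODES = {"Red": 0, "Blue": 1}


def load_fights(csv_path):
    """Hypothetical helper: read one fights CSV and return encoded rows."""
    rows = []
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        for record in csv.DictReader(handle):
            # Skip draws (and any other label outside the Red/Blue mapping),
            # mirroring the notebook's binary-classification setup.
            if record["Winner"] not in WINNER_CODES:
                continue
            record["Winner"] = WINNER_CODES[record["Winner"]]
            for column in ("B_Stance", "R_Stance"):
                # Unknown or blank stances become None rather than a guessed code.
                record[column] = STANCE_CODES.get(record[column])
            rows.append(record)
    return rows
```

A call such as `load_fights("data/fights.csv")` (file name assumed) returns a list of dictionaries. All other columns stay as the raw strings from the CSV, so numeric features still need converting before they are passed to a model.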