Network-Intrusion-Detection-using-DeepLearning

Problem Overview

With the expanded applications of modern-day networking, network infrastructures are at risk from cyber-attacks and intrusions. Multiple datasets have been proposed in the literature that can be used to create Machine Learning (ML) based Network Intrusion Detection Systems (NIDS). However, many of these datasets lead to sub-optimal performance and do not adequately represent all types of intrusions. Models trained on them also tend to have low accuracy on tail classes, that is, the intrusion types with few samples. To address these issues, we propose the University of Nevada - Reno Intrusion Detection Dataset (UNR-IDD), which provides researchers with a wider range of samples and scenarios.

Note: This is a real-world problem. For more information, including how the University of Nevada, Reno collected the data, see https://www.tapadhirdas.com/unr-idd-dataset

Multi-class Classification

The goal of multi-class classification is to differentiate the intrusions not only from normal working conditions but also from each other. Multi-class classification helps us learn about the root causes of network intrusions. The labels for multi-class classification in UNR-IDD are listed in the table below.

Label	Description
Normal	Network Functionality.
TCP-SYN	TCP-SYN Flood.
PortScan	Port Scanning.
Overflow	Flow Table Overflow.
Blackhole	Blackhole Attack.
Diversion	Traffic Diversion Attack.
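
A classifier usually needs these six labels as integer class indices. The sketch below shows one way to encode them. The ordering and the helper names (`encode_labels`, `decode_labels`) are assumptions for illustration and are not taken from the notebook.

```python
# Class names as listed in the UNR-IDD multi-class label table.
LABELS = ["Normal", "TCP-SYN", "PortScan", "Overflow", "Blackhole", "Diversion"]

# Hypothetical integer encoding: the index order is an assumption.
LABEL_TO_INDEX = {name: index for index, name in enumerate(LABELS)}
INDEX_TO_LABEL = {index: name for name, index in LABEL_TO_INDEX.items()}


def encode_labels(names):
    """Map label strings to integer class indices."""
    return [LABEL_TO_INDEX[name] for name in names]


def decode_labels(indices):
    """Map integer class indices back to label strings."""
    return [INDEX_TO_LABEL[index] for index in indices]
```

An unknown label raises `KeyError`, which is usually preferable to silently mapping it to an existing class.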

The Project Methodology

The figure below shows our project pipeline.

Results:

Baseline Model Accuracy

Min_train_acc	Max_train_acc	AVG_train_acc	Min_test_acc	Max_test_acc	AVG_test_acc	Min_valid_acc	Max_valid_acc	AVG_valid_acc
0.797251	0.812142	0.806109	0.752228	0.768271	0.760071	0.745098	0.771836	0.759358
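
The Min, Max and AVG columns used throughout the results can be produced by a small helper like the one below. This is a sketch that assumes each row summarizes several repeated training runs; the README does not state how many runs were made, and `summarize_runs` is a hypothetical name.

```python
from statistics import mean


def summarize_runs(accuracies):
    """Return (min, max, mean) of the accuracies from repeated runs."""
    if not accuracies:
        raise ValueError("need at least one run")
    return min(accuracies), max(accuracies), mean(accuracies)


# Illustrative values only, not results from the notebook.
low, high, average = summarize_runs([0.75, 0.77, 0.76])
```

Reporting the spread alongside the mean makes it easier to tell whether a difference between two configurations is larger than the run-to-run noise.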

Feature selection and Dimensionality reduction

Method	Number of Features	AVG_test_acc	AVG_valid_acc
PCA	27	0.7754	0.7683
ANOVA	11	0.778	0.7764
Mutual Information	19	0.769	0.7532
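
ANOVA feature selection scores each feature by how well its values separate the classes and keeps the top-scoring ones. The sketch below computes the one-way ANOVA F statistic in plain Python to show the idea. It is not the notebook's code; in practice a library routine such as scikit-learn's `SelectKBest` with `f_classif` would normally be used. The names `anova_f_score` and `select_k_best` are illustrative.

```python
from collections import defaultdict


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across classes."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n_total = len(values)
    n_classes = len(groups)
    if n_classes < 2 or n_total <= n_classes:
        raise ValueError("need at least 2 classes and more samples than classes")

    grand_mean = sum(values) / n_total
    ss_between = 0.0  # variation of the class means around the grand mean
    ss_within = 0.0   # variation of the samples around their class mean
    for members in groups.values():
        group_mean = sum(members) / len(members)
        ss_between += len(members) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in members)

    ms_between = ss_between / (n_classes - 1)
    ms_within = ss_within / (n_total - n_classes)
    if ms_within == 0:
        # No within-class spread: perfectly separating or constant feature.
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def select_k_best(rows, labels, k):
    """Return the sorted column indices of the k highest-scoring features."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])
```

A high F score means the class means differ a lot relative to the spread inside each class, which is why such features tend to help the classifier.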

Batch size

Batch size	Min_train_acc	Max_train_acc	AVG_train_acc	Min_test_acc	Max_test_acc	AVG_test_acc	Min_valid_acc	Max_valid_acc	AVG_valid_acc
32	0.758687	0.788087	0.769378	0.755793	0.778966	0.770053	0.755793	0.782531	0.765419
64	0.756014	0.76785	0.762047	0.755793	0.784314	0.773262	0.721925	0.754011	0.745098
128	0.757159	0.781214	0.769454	0.757576	0.780749	0.768627	0.748663	0.795009	0.77041
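
The batch size is the number of samples the model sees per gradient update. A minimal sketch of mini-batch iteration is given below. The shuffling, the fixed seed and the name `iter_minibatches` are assumptions for illustration; a deep learning framework's data loader would normally handle this.

```python
import random


def iter_minibatches(samples, batch_size, seed=0):
    """Yield shuffled mini-batches; the last batch may be smaller."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)  # reshuffle deterministically per seed
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


# With 100 samples and a batch size of 32 there are 4 updates per epoch.
batch_lengths = [len(batch) for batch in iter_minibatches(list(range(100)), 32)]
```

Smaller batches give more updates per epoch with noisier gradients, while larger batches give fewer, smoother updates.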

Hidden Layers vs Neurons

1 Hidden Layer

Neurons	Min_train_acc	Max_train_acc	AVG_train_acc	Min_test_acc	Max_test_acc	AVG_test_acc	Min_valid_acc	Max_valid_acc	AVG_valid_acc
10	0.752577	0.772432	0.763727	0.754011	0.777184	0.768984	0.730838	0.775401	0.752941
20	0.769378	0.78236	0.775105	0.768271	0.789661	0.781105	0.759358	0.791444	0.773975
30	0.775105	0.796105	0.784422	0.778966	0.818182	0.802139	0.773619	0.798574	0.785383
40	0.777396	0.802215	0.788698	0.775401	0.807487	0.793583	0.770053	0.798574	0.788235

2 Hidden Layers

Neurons	Min_train_acc	Max_train_acc	AVG_train_acc	Min_test_acc	Max_test_acc	AVG_test_acc	Min_valid_acc	Max_valid_acc	AVG_valid_acc
10	0.766705	0.783123	0.777854	0.775401	0.800357	0.78574	0.746881	0.777184	0.768627
20	0.780069	0.786942	0.784422	0.782531	0.802139	0.7918	0.770053	0.793226	0.780036
30	0.783887	0.79496	0.79076	0.793226	0.816399	0.802852	0.780749	0.793226	0.78574
40	0.789996	0.809851	0.799771	0.798574	0.827094	0.810695	0.782531	0.809269	0.795722

4 Hidden Layers

Neurons	Min_train_acc	Max_train_acc	AVG_train_acc	Min_test_acc	Max_test_acc	AVG_test_acc	Min_valid_acc	Max_valid_acc	AVG_valid_acc
10	0.765559	0.791905	0.779305	0.764706	0.800357	0.783957	0.775401	0.787879	0.783957
20	0.779305	0.807178	0.793127	0.784314	0.821747	0.804635	0.770053	0.809269	0.792513
30	0.803742	0.832761	0.815426	0.803922	0.825312	0.814617	0.787879	0.83779	0.812478
40	0.800305	0.831233	0.818175	0.805704	0.83779	0.823173	0.800357	0.818182	0.808913

8 Hidden Layers

Neurons	Min_train_acc	Max_train_acc	AVG_train_acc	Min_test_acc	Max_test_acc	AVG_test_acc	Min_valid_acc	Max_valid_acc	AVG_valid_acc
10	0.680794	0.76365	0.734555	0.7041	0.777184	0.748307	0.68984	0.762923	0.739394
20	0.771287	0.835815	0.804811	0.762923	0.819964	0.748307	0.780749	0.816399	0.804991
30	0.799542	0.818633	0.811531	0.798574	0.836007	0.818538	0.802139	0.821747	0.810695
40	0.806796	0.827797	0.817717	0.800357	0.84492	0.824599	0.807487	0.825312	0.814617
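
To get a feel for how depth and width change model capacity, the sketch below counts the trainable parameters of a fully connected network. It assumes every hidden layer has the same number of neurons, 16 input features (the figure given in the Conclusion) and 6 output classes (the label table). The actual architecture in the notebook may differ, and the function names are illustrative.

```python
def mlp_layer_sizes(n_inputs, n_hidden_layers, n_neurons, n_classes):
    """Layer widths of an MLP whose hidden layers all share one width."""
    return [n_inputs] + [n_neurons] * n_hidden_layers + [n_classes]


def count_parameters(layer_sizes):
    """Total weights plus biases of a fully connected network."""
    return sum(
        fan_in * fan_out + fan_out
        for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:])
    )


# Parameter counts for the grid explored above, under the stated assumptions.
grid = {
    (layers, neurons): count_parameters(mlp_layer_sizes(16, layers, neurons, 6))
    for layers in (1, 2, 4, 8)
    for neurons in (10, 20, 30, 40)
}
```

Under these assumptions the parameter count grows roughly linearly with the number of hidden layers and quadratically with their width.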

Optimizer

Optimizer	Min_train_acc	Max_train_acc	AVG_train_acc	Min_test_acc	Max_test_acc	AVG_test_acc	Min_valid_acc	Max_valid_acc	AVG_valid_acc
AdamW_lr_0.1	0.254296	0.369225	0.277281	0.226381	0.397504	0.260606	0.251337	0.331551	0.26738
AdamW_lr_0.01	0.669721	0.791142	0.758381	0.688057	0.803922	0.77148	0.684492	0.805704	0.76934
AdamW_lr_0.001	0.797633	0.818251	0.809927	0.791444	0.828877	0.810695	0.802139	0.823529	0.814973
SGD_m_0.1_lr_0.1	0.747614	0.783123	0.759832	0.764706	0.789661	0.77148	0.754011	0.782531	0.766845
SGD_m_0.1_lr_0.01	0.731195	0.758687	0.744635	0.723708	0.762923	0.753298	0.73262	0.766488	0.749376
SGD_m_0.1_lr_0.001	0.693394	0.746468	0.716991	0.698752	0.773619	0.735116	0.686275	0.759358	0.718004
SGD_m_0.5_lr_0.1	0.691867	0.789614	0.759756	0.7041	0.807487	0.768627	0.695187	0.795009	0.766132
SGD_m_0.5_lr_0.01	0.775487	0.804124	0.789309	0.773619	0.802139	0.790374	0.766488	0.809269	0.792513
SGD_m_0.5_lr_0.001	0.745323	0.761741	0.752348	0.752228	0.770053	0.759002	0.752228	0.768271	0.756506
SGD_m_0.9_lr_0.1	0.620084	0.744177	0.703475	0.631016	0.762923	0.716221	0.631016	0.759358	0.713725
SGD_m_0.9_lr_0.01	0.768614	0.80756	0.795036	0.768271	0.814617	0.801426	0.778966	0.816399	0.802496
SGD_m_0.9_lr_0.001	0.768996	0.793051	0.785185	0.775401	0.802139	0.790018	0.768271	0.800357	0.788235
RMSprop_lr_0.1	0.247805	0.468118	0.332264	0.226381	0.508021	0.347237	0.251337	0.449198	0.330838
RMSprop_lr_0.01	0.723177	0.771287	0.741733	0.736185	0.786096	0.757932	0.743316	0.786096	0.760428
RMSprop_lr_0.001	0.771287	0.808706	0.787247	0.773619	0.812834	0.790731	0.768271	0.812834	0.787166
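
The row names appear to encode the optimizer settings: `lr` is the learning rate and `m` is read here as the SGD momentum, which is an assumption. The sketch below shows one common formulation of the SGD-with-momentum update on plain Python lists. It is illustrative only, and the framework's own optimizer may differ in details such as dampening or Nesterov momentum.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update; returns (new_params, new_velocity)."""
    # Velocity is a decaying running sum of past gradients.
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    # Parameters move against the accumulated velocity.
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def minimize_quadratic(start, steps, lr=0.01, momentum=0.9):
    """Toy example: minimize f(x) = x^2, whose gradient is 2x."""
    params, velocity = [start], [0.0]
    for _ in range(steps):
        grads = [2.0 * params[0]]
        params, velocity = sgd_momentum_step(params, grads, velocity, lr, momentum)
    return params[0]
```

Because the velocity accumulates gradients, a learning rate that works without momentum can overshoot once momentum is high, so the two settings need to be tuned together.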

Activation Function

Activation Function	Min_train_acc	Max_train_acc	AVG_train_acc	Min_test_acc	Max_test_acc	AVG_test_acc	Min_valid_acc	Max_valid_acc	AVG_valid_acc
Relu	0.80718	0.82818	0.8155	0.80927	0.83066	0.8164	0.79679	0.81996	0.8107
Leaky_relu	0.804124	0.82054	0.81191	0.80392	0.84136	0.81854	0.80392	0.823529	0.81177
Sigmoid	0.2543	0.57427	0.44162	0.22638	0.61497	0.44385	0.25134	0.61141	0.46631
Tanh	0.79572	0.80298	0.80046	0.79857	0.81283	0.80535	0.80927	0.827094	0.8189
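
For reference, the four activation functions compared above can be written as scalar functions. This is a sketch of the standard definitions. The Leaky ReLU slope of 0.01 is a common default and an assumption here, since the README does not state the value used.

```python
import math


def relu(x):
    """Pass positive inputs through, clamp negative inputs to zero."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Like ReLU, but negative inputs keep a small slope."""
    return x if x > 0 else negative_slope * x


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x):
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(x)
```

Sigmoid saturates for large positive or negative inputs, so its gradient shrinks as it passes through many stacked layers. That is a plausible, though unverified, explanation for its much lower accuracy in the table.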

Comparing Results

T-SNE Figures

Original Dataset

PCA Transformed Dataset

ANOVA Selected Features

Confusion Matrix

Baseline Model	After Feature Selection	After Parameter Selection
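
A confusion matrix counts, for each true class, how often every class was predicted. The sketch below builds one from integer class indices and derives per-class recall, which is a convenient way to inspect tail-class performance. The function names are illustrative and this is not the notebook's plotting code.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_index, predicted_index in zip(y_true, y_pred):
        matrix[true_index][predicted_index] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]
```

Overall accuracy can hide a class that is almost never predicted correctly, whereas the per-class recall makes such a class visible immediately.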

Best Parameter Effect on the Testing Accuracy

Parameters	Best Choice	Test Accuracy
Feature Selection	ANOVA	0.7786
Batch Size	64	0.7732
Hidden Layers + Neurons	8 layers, 40 neurons	0.815
Optimizer	AdamW with lr 0.001	0.811
Activation Function	Leaky ReLU	0.82

Conclusion

The optimization pipeline improved the following:

The data representation, which now uses 16 features instead of 33.
The data separation, as shown in the T-SNE figures.
The model's ability to distinguish between the classes, which improved significantly, as shown in the confusion matrices.
