Mining Network Anomalies
In the simplest terms, anomalies in heterogeneous environments are the data points that differ considerably from the remainder of the data.
Today, huge amounts of data are stored and transferred from one location to another. Data in storage or in transit is exposed to attack. Although various techniques and applications are available to protect data, loopholes exist. Data mining techniques have therefore emerged to analyze data and identify various kinds of attack, making the data less vulnerable. Anomaly detection uses these data mining techniques to detect surprising behavior hidden within the data, behavior that may indicate an increased chance of intrusion or attack. Various hybrid approaches have also been proposed to detect known and unknown attacks more accurately.[1]
- There are considerably more “normal” observations than “abnormal” observations (outliers/anomalies) in the data
- How many outliers are there in the data?
- Method is unsupervised
  - Validation can be quite challenging (just like for clustering)
- Finding a needle in a haystack
- Types of anomaly detection schemes
  - Graphical & Statistical-based
    - Dimensional plots (e.g. box plot, scatter plot etc.)
    - Convex Hull Method
    - Statistical Tests (e.g. Likelihood)
  - Distance-based
    - Nearest-neighbor based
    - Density based
    - Clustering based
  - Model-based
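
Two of the schemes listed above can be illustrated with a minimal sketch: the box-plot rule from the statistical family and a nearest-neighbor score from the distance-based family. The function names `iqr_outliers` and `knn_scores`, the multiplier `k=1.5`, and the sample values are assumptions made for illustration only; they do not come from the cited sources, and the sample numbers are not real network measurements.

```python
import math
import statistics


def iqr_outliers(values, k=1.5):
    """Box-plot rule: flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional box-plot whisker multiplier (an assumption here).
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def knn_scores(points, k=2):
    """Nearest-neighbor score: distance from each point to its k-th nearest neighbor.

    A larger score means the point is farther from the others.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical one-dimensional feature (e.g. some per-connection count).
sizes = [52, 48, 50, 51, 49, 47, 53, 50, 400]
print(iqr_outliers(sizes))

# Hypothetical two-dimensional feature vectors; the last one sits far from the rest.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (8.0, 8.0)]
scores = knn_scores(points, k=2)
print(scores.index(max(scores)))
```

Both functions are unsupervised, so they only rank or flag candidates. As the list notes, deciding how many of the flagged points are true anomalies still requires separate validation.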
[1] Shikha Agrawal, Jitendra Agrawal, Survey on Anomaly Detection using Data Mining Techniques, Procedia Computer Science, Volume 60, 2015, Pages 708-713, ISSN 1877-0509, http://dx.doi.org/10.1016/j.procs.2015.08.220. (http://www.sciencedirect.com/science/article/pii/S1877050915023479)

[2] https://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap10_anomaly_detection.pdf