Statistical topics relating to computer network anomaly detection

Date
2012
DOI
Authors
Ding, Qi
Version
Embargo Date
Indefinite
OA Version
Citation
Abstract
This dissertation makes fundamental contributions to statistical methods relating to the detection of anomalies in the context of computer network traffic monitoring. In particular, it contributes basic statistical tools for socially-based network anomaly characterization and detection, it extends a popular detection methodology to high-dimensional contexts, and it demonstrates that standard flow sampling can interact with inherent network topology in ways unexpected. In the first contribution of my research, I define anomalous intrusion in terms of locations in social space, rather than in physical space. I develop statistical detectors based on simple graph-based summaries of the network, with a focus on detecting anti-social behaviors. This research suggests that certain values of local graphical measurements, like clustering coefficients and betweenness centrality, are associated with the malicious antisocial behaviors in the types of network representations of IP flow measurements used in this work. This motivates me to propose a simple, efficient and robust anomaly detection technique. I evaluate this methodology on different network representations and using different social summaries. In the second contribution of my research, I extend the use of the PCA subspace method to high-dimensional spaces. Specifically, I show that, under appropriate conditions,with high probability the magnitude of the residuals of a standard PCA subspace analysis of randomly projected data behaves comparably to that of the residuals of a similar PCA analysis of the original data. My results indicate the feasibility of applying subspacebased anomaly detection algorithms to Gaussian random projection data. This concept is illustrated in the context of computer network traffic anomaly detection for the purpose of detecting volume anomalies. The impact of sampling on so-called Peer-to-Peer (P2P) network analysis is the focus of the third contribution of my research. In this research I use a combination of probability calculations and simulation techniques to characterize the extent to which standard packet sampling in the Internet can adversely affect the topology of stylized versions of Bittorrent download networks reconstructed from measurements of network flows. The results indicate that a certain stratification observed in these networks impacts the reconstructed topology in ways decidedly different from typical networks which have no stratification.
Description
Thesis (Ph.D.)--Boston University
PLEASE NOTE: Boston University Libraries did not receive an Authorization To Manage form for this thesis or dissertation. It is therefore not openly accessible, though it may be available by request. If you are the author or principal advisor of this work and would like to request open access for it, please contact us at open-help@bu.edu. Thank you.
License