Anomaly detection and dynamic decision making for stochastic systems
This dissertation focuses on two types of problems, both of which are related to systems with uncertainties. The first problem concerns network system anomaly detection. We present several stochastic and deterministic methods for anomaly detection of networks whose normal behavior is not time-varying. Our methods cover most of the common techniques in the anomaly detection field. We evaluate all methods in a simulated network that consists of nominal data, three flow-level anomalies and one packet-level attack. Through analyzing the results, we summarize the advantages and the disadvantages of each method. As a next step, we propose two robust stochastic anomaly detection methods for networks whose normal behavior is time-varying. We develop a procedure for learning the underlying family of patterns that characterize a time-varying network. This procedure first estimates a large class of patterns from network data and then refines it to select a representative subset. The latter part formulates the refinement problem using ideas from set covering via integer programming. Then we propose two robust methods, one model-free and one model-based, to evaluate whether a sequence of observations is drawn from the learned patterns. Simulation results show that the robust methods have significant advantages over the alternative stationary methods in time-varying networks. The final anomaly detection setting we consider targets the detection of botnets before they launch an attack. Our method analyzes the social graph of the nodes in a network and consists of two stages: (i) network anomaly detection based on large deviations theory and (ii) community detection based on a refined modularity measure. We apply our method on real-world botnet traffic and compare its performance with other methods. The second problem considered by this dissertation concerns sequential decision mak- ings under uncertainty, which can be modeled by a Markov Decision Processes (MDPs). We focus on methods with an actor-critic structure, where the critic part estimates the gradient of the overall objective with respect to tunable policy parameters and the actor part optimizes a policy with respect to these parameters. Most existing actor- critic methods use Temporal Difference (TD) learning to estimate the gradient and steepest gradient ascent to update the policies. Our first contribution is to propose an actor-critic method that uses a Least Squares Temporal Difference (LSTD) method, which is known to converge faster than the TD methods. Our second contribution is to develop a new Newton-like actor-critic method that performs better especially for ill-conditioned problems. We evaluate our methods in problems motivated from robot motion control.
Thesis (Ph.D.)--Boston University