Show simple item record

dc.contributor.authorAksoylar, Cemen_US
dc.date.accessioned2017-03-16T17:24:46Z
dc.date.available2017-03-16T17:24:46Z
dc.date.issued2017
dc.identifier.urihttps://hdl.handle.net/2144/20836
dc.description.abstractMany learning and inference problems involve high-dimensional data such as images, video or genomic data, which cannot be processed efficiently using conventional methods due to their dimensionality. However, high-dimensional data often exhibit an inherent low-dimensional structure, for instance they can often be represented sparsely in some basis or domain. The discovery of an underlying low-dimensional structure is important to develop more robust and efficient analysis and processing algorithms. The first part of the dissertation investigates the statistical complexity of sparse recovery problems, including sparse linear and nonlinear regression models, feature selection and graph estimation. We present a framework that unifies sparse recovery problems and construct an analogy to channel coding in classical information theory. We perform an information-theoretic analysis to derive bounds on the number of samples required to reliably recover sparsity patterns independent of any specific recovery algorithm. In particular, we show that sample complexity can be tightly characterized using a mutual information formula similar to channel coding results. Next, we derive major extensions to this framework, including dependent input variables and a lower bound for sequential adaptive recovery schemes, which helps determine whether adaptivity provides performance gains. We compute statistical complexity bounds for various sparse recovery problems, showing our analysis improves upon the existing bounds and leads to intuitive results for new applications. In the second part, we investigate methods for improving the computational complexity of subgraph detection in graph-structured data, where we aim to discover anomalous patterns present in a connected subgraph of a given graph. This problem arises in many applications such as detection of network intrusions, community detection, detection of anomalous events in surveillance videos or disease outbreaks. Since optimization over connected subgraphs is a combinatorial and computationally difficult problem, we propose a convex relaxation that offers a principled approach to incorporating connectivity and conductance constraints on candidate subgraphs. We develop a novel nearly-linear time algorithm to solve the relaxed problem, establish convergence and consistency guarantees and demonstrate its feasibility and performance with experiments on real networks.en_US
dc.language.isoen_US
dc.subjectElectrical engineeringen_US
dc.subjectConvex optimizationen_US
dc.subjectLearning on networksen_US
dc.subjectSparse recoveryen_US
dc.subjectStatistical complexityen_US
dc.subjectStatistical signal processingen_US
dc.titleDiscovery of low-dimensional structure in high-dimensional inference problemsen_US
dc.typeThesis/Dissertationen_US
dc.date.updated2017-03-10T02:07:22Z
etd.degree.nameDoctor of Philosophyen_US
etd.degree.leveldoctoralen_US
etd.degree.disciplineElectrical & Computer Engineeringen_US
etd.degree.grantorBoston Universityen_US


This item appears in the following Collection(s)

Show simple item record