Interrogating signaling pathways using transcriptomic and proteomic data
Parikh, Jignesh Rajesh
Information from the cell surface is propagated to the nucleus via interconnected signaling pathways that regulate transcription factors, thereby controlling cellular processes such as migration, proliferation, and differentiation through context-dependent gene expression. Aberrations in signaling pathways have been associated with several human diseases, including cancer. In this work, I develop novel tools and algorithms to identify altered signaling pathways and the relationships between them from high-throughput gene expression microarray and tandem mass-spectrometry data. First, I integrate pathway annotations with publicly available lists of differentially expressed genes to identify transcriptional dependencies between pairs of pathways, revealing a modular pathway co-differential expression meta-network. The meta-network approach is also applied to gene sets defined by cytogenetic bands to understand the relationship between pathways and genome organization; notably, co-differentially expressed chromosome loci are more proximal in three-dimensional space and consist of genes that participate in the same pathway to a greater extent than loci that are not co-expressed. The pathway co-differential expression network, along with other gene set co-differential expression meta-networks, is made available through a web tool called MetaNet that analyzes user-defined gene lists in the context of a network of transcriptionally dependent pathways. Next, I develop Signaling Pathway Enrichment using Experimental Data sets (SPEED), a manually curated data collection and algorithm that allows for identification of upstream signaling pathways that cause an observed gene expression pattern. The intuition behind SPEED is that there are distinct gene expression signatures per signaling pathway perturbation; the signatures can be subsequently used to predict upstream signaling pathways from a user-defined list of differentially expressed genes.
SPEED signatures cluster signaling pathways into two distinct groups, separating immune response pathways from cell growth pathways. The SPEED algorithm and data collection are publicly available for analysis of user-defined gene lists via a web server. Finally, I define and implement a minimal API (mzAPI) through an open-source desktop application called multiplierz that provides direct, programmatic interaction with binary raw mass spectrometry files. I use multiplierz to analyze proteomics data for the identification of specific phosphorylation events in signaling cascades. Collectively, these novel tools and algorithms contribute to an improved understanding of signaling pathways.
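As a rough illustration of the signature-based intuition described for SPEED, the sketch below ranks pathways by how strongly a user-defined gene list overlaps each pathway's signature, using a one-sided hypergeometric test. This is a generic gene-set enrichment sketch and not necessarily the statistic SPEED itself uses; the function names (`hypergeom_sf`, `rank_pathways`), the pathway labels, the gene identifiers, and the background size are all hypothetical placeholders.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric draw without replacement."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def rank_pathways(query_genes, signatures, background_size):
    """Rank pathways by enrichment of their signature genes in the query list.

    signatures maps a pathway name to the set of genes in its signature.
    Returns (pathway, overlap, p_value) tuples, most significant first.
    """
    query = set(query_genes)
    results = []
    for pathway, signature in signatures.items():
        signature = set(signature)
        overlap = len(query & signature)
        p_value = hypergeom_sf(overlap, background_size, len(signature), len(query))
        results.append((pathway, overlap, p_value))
    return sorted(results, key=lambda row: row[2])


# Toy, made-up signatures and query list (placeholders, not real data).
toy_signatures = {
    "pathway_A": {"g1", "g2", "g3", "g4"},
    "pathway_B": {"g5", "g6", "g7", "g8"},
}
toy_query = ["g5", "g6", "g8", "g9"]

ranked = rank_pathways(toy_query, toy_signatures, background_size=1000)
for pathway, overlap, p_value in ranked:
    print(pathway, overlap, p_value)
```

In practice the background size would be the number of genes measurable on the platform, and p-values across many pathways would need multiple-testing correction; both are omitted here for brevity.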
Thesis (Ph.D.)--Boston University
PLEASE NOTE: Boston University Libraries did not receive an Authorization To Manage form for this thesis or dissertation. It is therefore not openly accessible, though it may be available by request. If you are the author or principal advisor of this work and would like to request open access for it, please contact us at email@example.com. Thank you.