Integrative multi-omic network strategies for unraveling complex disease biology and the identification of novel phenotype associated genes
Identifying the genetic risk factors underlying a given disease is an essential step for informing effective drug targets, understanding disease architecture, and predicting at-risk individuals. A common approach for identifying novel disease-associated genes is the genome-wide association study (GWAS), in which a large number of individuals are sequenced and genetic variants are then tested for association with disease status. While GWAS has identified many disease-associated genes, there remain many diseases for which our genetic understanding is still incomplete. One strategy for augmenting GWAS is to incorporate additional omics data in order to prioritize biologically plausible candidate genes. In this thesis work, we integrate network-based strategies with existing genetic analysis pipelines in order to identify novel Alzheimer’s disease (AD) genes. Two types of biological data inform the underlying structure of the networks: a) protein-protein interactions and b) gene expression in the human brain. Genes that interact or are co-expressed across similar conditions have been shown to have a higher probability of being functionally related. Using a set of previously known AD genes, we apply a network propagation strategy to score genes based on their proximity to the known AD genes within these networks. We then integrate the network score of each gene with its risk score from GWAS to identify novel candidates. To assess the reproducibility of these findings, we incorporate additional information in the form of knockout models in flies, bootstrap aggregation, and external genetic datasets. In addition to predicting novel genes, we use regional co-expression networks to further understand how the known AD genes behave within the various subdivisions of the brain.
We find that regions of the brain which are known to have the earliest vulnerability to AD-induced neurodegeneration also tend to be where AD genes are highly correlated.
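As an illustration of the network propagation step described above, the following is a minimal sketch using random walk with restart, one common propagation scheme. The abstract does not state which algorithm the thesis uses, so the choice of method, the function name `propagate`, the `restart` parameter, and the toy four-gene network are all assumptions made for illustration only.

```python
import numpy as np

def propagate(adjacency, seeds, restart=0.5, tol=1e-8, max_iter=1000):
    """Score nodes by random walk with restart from a set of seed nodes.

    Hypothetical helper: not the thesis's actual implementation.
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # isolated nodes: avoid division by zero
    W = A / col_sums                   # column-normalized transition matrix

    # Restart distribution: uniform over the seed (known disease) genes.
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy network (assumed): a chain of four genes 0-1-2-3, with gene 0 as the seed.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
scores = propagate(adjacency, seeds=[0])
```

In this toy example, genes closer to the seed receive higher scores, which is the sense in which propagation measures proximity to known disease genes. How such a network score would be combined with a GWAS risk score is not specified in the abstract and is not sketched here.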