Genetic association studies of Alzheimer disease using multi-phenotype tests and gene-based tests
The genome-wide association study (GWAS) approach has identified novel loci for a variety of complex diseases. However, for most of these disorders, much of the heritability is not explained by this approach, which focuses on identifying common variants that are associated with disease risk. The unexplained heritability may be due to genetic or phenotypic heterogeneity or to the influence of rare variants. The motivation behind this thesis was to uncover the unexplained heritability by applying joint analyses of sets of variants (gene-based association tests) and of multiple disease-related phenotypes (multivariate gene-based association tests). First, we evaluated multivariate gene-based methods for detecting association of common genetic variants with correlated phenotypes. An extensive simulation study showed that the method combining the MultiPhen and GATES software performed best in most tested scenarios, especially when correlations among phenotypes were relatively low. We then developed a new multivariate gene-based test for rare variants, called VEMPHAS. A simulation study showed that VEMPHAS correctly controls type I error in all tested scenarios. We applied VEMPHAS to the analysis of various phenotypes related to Alzheimer disease (AD) and found suggestive association (P < 4.15 × 10⁻⁶) with the gene TRIM22, which was identified in a previous sequencing study of AD onset in PSEN1/2 mutation carriers. We also developed software with a graphical user interface that is designed to integrate information from different types of data sources, including genetic data (from GWAS or sequencing), expression data (from RNA-Seq), and protein structures (from protein data banks).
This software has several features, including 1) testing associations between genetic variants and gene expression; 2) locating the amino acids encoded by the variants in a protein structure; and 3) retrieving the genomic locations (chromosome and base pair positions) of amino acids of interest in the protein structure. The last feature can be used to prioritize coding variants for gene-based association testing. The methods and strategies developed for this dissertation project can effectively uncover a portion of the remaining heritability of complex diseases that is unexplained by traditional GWAS approaches.