Methods for multi-site and multi-tissue analysis of DNA methylation data
DNA methylation is an epigenetic modification that plays an important role in gene regulation. DNA methylation varies between individuals and between tissues in the same individual. Many cohorts have measured DNA methylation in one or more tissues at hundreds of thousands of sites across the genome using methylation microarrays, and a standard analysis approach is to model the relationship between DNA methylation and a phenotype at each site and in each tissue separately. In this thesis, we explore methods for jointly analyzing multiple sites and/or multiple tissues. First, we propose a novel approach to identify differentially methylated regions (DMRs), that is, sets of neighboring sites in a single tissue that are associated with a phenotype, and compare our approach to two existing approaches for detecting DMRs. We show that our method is useful when there are multiple sites in a region with weak or moderate associations with a phenotype. Then, we return to single-site analysis and evaluate methods for analyzing data from multiple tissues, accounting for the correlation between two tissue samples from the same individual. We consider methods that model both the mean and variance of methylation at a site as well as methods that model mean methylation only. In addition to evaluating existing models, we propose a novel random-effects meta-analysis, which is appropriate for meta-analyzing multiple parameters from correlated studies (or tissues). We show that type I error is inflated for all meta-analysis methods and for methods that model the variance of methylation. Finally, we evaluate methods to incorporate information from multiple sites and multiple tissues in association tests. We examine a gene set analysis method, MAGENTA, which was developed for genetic association studies, and propose an extension that is appropriate for DNA methylation data.
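For readers unfamiliar with the baseline that the abstract contrasts against, the standard site-by-site analysis in a single tissue can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's method: it assumes a simple linear regression of methylation on the phenotype with no covariates, and the function name `per_site_association` and its arguments are hypothetical.

```python
import numpy as np

def per_site_association(meth, phenotype):
    """Regress methylation on a phenotype separately at each site.

    meth:      (n_samples, n_sites) methylation values from one tissue.
    phenotype: (n_samples,) phenotype values.
    Returns the per-site slope estimates and their t statistics.
    Assumption: simple linear model with an intercept and no covariates.
    """
    meth = np.asarray(meth, dtype=float)
    x = np.asarray(phenotype, dtype=float)
    n = x.shape[0]

    # Center the phenotype and each site's methylation values.
    xc = x - x.mean()
    yc = meth - meth.mean(axis=0)
    sxx = np.sum(xc ** 2)

    # Ordinary least squares slope for every site at once.
    slopes = xc @ yc / sxx

    # Residual variance (n - 2 degrees of freedom) and slope standard error.
    resid = yc - np.outer(xc, slopes)
    sigma2 = np.sum(resid ** 2, axis=0) / (n - 2)
    se = np.sqrt(sigma2 / sxx)

    return slopes, slopes / se
```

Each site is tested independently here, which is exactly the limitation that the joint multi-site and multi-tissue methods in the thesis aim to address.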