Statistical methods for genetic association studies: multi-cohort and rare genetic variants approaches
Genetic association studies have successfully identified many genetic markers associated with complex human diseases and related quantitative traits. However, for most complex diseases and quantitative traits, the associated genetic markers identified to date explain only a small proportion of heritability. Exploring the unexplained heritability in these traits will therefore help us discover novel genetic determinants and better understand disease etiology and pathophysiology. Because of limited sample size, a single cohort study may not have sufficient power to identify novel genetic associations with small effect sizes; meta-analysis approaches have been proposed and applied to combine results from multiple cohorts in large consortia, increasing the sample size and statistical power. Rare genetic variants and gene by environment interaction may both account for part of this unexplained heritability. In this dissertation, we develop statistical methods for meta-analysis, rare genetic variants analysis, and gene by environment interaction analysis, conduct extensive simulation studies, and apply these methods to real data examples. First, we develop a method of moments estimator for the between-study covariance matrix in random effects multivariate meta-analysis. Our estimator is the first such estimator in matrix form, and it is invariant to linear transformations. Its performance is similar to that of existing methods in simulation studies and real data analysis. Next, we extend the Sequence Kernel Association Test (SKAT), a rare genetic variants analysis approach for unrelated individuals, to family samples for quantitative traits. The extension is necessary because the original test has inflated type I error when applied directly to related individuals, and selecting an unrelated subset from family samples reduces the sample size and power.
Finally, we derive methods for rare genetic variants analysis to detect gene by environment interaction on quantitative traits, in the context of a univariate test of the interaction term parameter. We develop statistical tests in both the burden test and SKAT settings, for both unrelated and related individuals. Our methods are relevant to genetic association studies, and we hope that they can facilitate research in this field and beyond.
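As background for the method of moments approach mentioned above, the following is a minimal sketch of the classical univariate analog, the DerSimonian-Laird estimator of between-study variance. It is not the multivariate matrix-form estimator developed in this dissertation; the function name `dl_between_study_variance` and the input values are hypothetical and chosen only for illustration.

```python
def dl_between_study_variance(effects, variances):
    """Univariate DerSimonian-Laird method of moments estimate of tau^2.

    effects   -- per-study effect estimates (illustrative inputs)
    variances -- per-study within-study variances, assumed known
    """
    k = len(effects)
    if k != len(variances) or k < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Inverse-variance (fixed effect) weights.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed effect pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q statistic.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

    # Scaling constant that equates Q with its expectation.
    c = sum_w - sum(w * w for w in weights) / sum_w

    # Truncate at zero so the variance estimate is non-negative.
    return max(0.0, (q - (k - 1)) / c)


# Three hypothetical studies with equal within-study variance.
tau2 = dl_between_study_variance([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
```

In the multivariate setting the scalar variance is replaced by a between-study covariance matrix, and the truncation at zero no longer has a direct scalar counterpart, which is part of what makes a matrix-form estimator of interest.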