Statistical methods for genetic association studies: detecting gene x environment interaction in rare variant analysis
Investigators have discovered thousands of genetic variants associated with various traits using genome-wide association studies (GWAS). These discoveries have substantially improved our understanding of the genetic architecture of many complex traits. Despite this striking success, the trait-associated loci collectively explain only a small fraction of disease risk. Many explanations for this unexplained heritability have been suggested, and two understudied components are hypothesized to contribute to complex disease etiology: rare variants and gene-environment (GE) interactions. Advances in next-generation sequencing offer the opportunity to comprehensively investigate the contribution of rare variants to complex traits. Complex diseases are multifactorial, suggesting an interplay of genetic and environmental factors, but most GWAS have focused on the main effects of genetic variants and disregarded GE interactions. In this dissertation, we develop statistical methods to detect GE interactions in rare variant analysis for various types of outcomes in both independent and related samples. We leverage the joint information across a set of rare variants and implement variance component score tests to reduce the computational burden. First, we develop a GE interaction test for rare variants for binary and continuous traits in related individuals, which avoids restricting the analysis to unrelated individuals and thereby retains more samples. Next, we propose a method to test GE interactions in rare variants for time-to-event outcomes. Rare variant tests for survival outcomes remain underdeveloped, despite their importance in medical studies. We use a shrinkage method that imposes a ridge penalty on the genetic main effects to deal with potential multicollinearity.
Finally, we compare different types of penalties, such as the least absolute shrinkage and selection operator (LASSO) and elastic net regularization, to examine the performance of our second method under various simulation scenarios. We illustrate the proposed methods by applying them to detect gene x smoking interactions influencing body mass index and time-to-fracture in the Framingham Heart Study. Our proposed methods can be readily applied to a wide range of phenotypes and genetic epidemiologic studies, thereby providing insight into the biological mechanisms of complex diseases, identifying high-penetrance subgroups, and eventually leading to the development of better diagnostics and therapeutic interventions.
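
As a rough illustration of the general idea, and not the dissertation's actual implementation, the following Python sketch computes a variance-component-style score statistic for a set of GE interaction terms after fitting a null model with a ridge penalty on the genetic main effects. It assumes the simplest setting only: a continuous trait in independent samples. The function names, the penalty value `lam`, and the simulated data are all hypothetical. The null distribution of the statistic (typically a mixture of chi-square variables) is not computed here, so the sketch stops short of a p-value.

```python
import numpy as np

def ridge_null_fit(y, X, G, lam):
    """Fit y ~ X + G with a ridge penalty on the G coefficients only.

    X holds unpenalized covariates (intercept, environment, ...);
    G holds the rare-variant genotypes (genetic main effects).
    Returns the fitted values under this null (no-interaction) model.
    """
    W = np.hstack([X, G])
    # Hypothetical choice: penalize genetic main effects, not covariates.
    penalty = np.diag(np.r_[np.zeros(X.shape[1]), np.full(G.shape[1], lam)])
    coef = np.linalg.solve(W.T @ W + penalty, W.T @ y)
    return W @ coef

def ge_score_statistic(y, X, G, E, lam=1.0):
    """Score-type statistic Q = r' S S' r for the GE interaction terms.

    r is the residual from the ridge-penalized null model and
    S has columns E_i * G_ij (one interaction term per variant).
    """
    resid = y - ridge_null_fit(y, X, G, lam)
    S = G * E[:, None]           # interaction design matrix
    score = S.T @ resid          # per-variant interaction scores
    return float(score @ score)  # sum of squared scores

# Small simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 200, 5
G = rng.binomial(2, 0.02, size=(n, p)).astype(float)  # rare variants
E = rng.binomial(1, 0.3, size=n).astype(float)        # e.g. smoking status
X = np.column_stack([np.ones(n), E])                  # intercept + environment
y = rng.normal(size=n)                                # trait under the null
Q = ge_score_statistic(y, X, G, E, lam=1.0)
print(f"Q = {Q:.3f}")
```

Penalizing only the genetic main effects keeps the covariate and environment effects unshrunk while stabilizing the fit when the rare variants are sparse or collinear; a LASSO or elastic net penalty would replace the closed-form solve with an iterative fit.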