Understanding the genetic basis of complex polygenic traits through Bayesian model selection of multiple genetic models and network modeling of family-based genetic data
Bae, Harold Taehyun
The overall aim of this dissertation is to develop advanced statistical modeling methods to understand the genetic basis of complex polygenic traits. To achieve this goal, the dissertation focuses on the development of (i) a novel methodology to detect genetic variants with different inheritance patterns, formulated as a Bayesian model selection problem, (ii) integration of genetic and non-genetic data to dissect genotype-phenotype associations using Bayesian networks with family-based data, and (iii) an efficient technique to model family-based data in the Bayesian framework. In the first part of my dissertation, I present a coherent Bayesian framework for selecting the most likely of the five genetic models (genotypic, additive, dominant, co-dominant, and recessive) used in genetic association studies. The approach uses a polynomial parameterization of genetic data to fit the five models simultaneously and reduce computation. I provide a closed-form expression of the marginal likelihood for normally distributed data, and evaluate the performance of the proposed method and existing methods using simulated and real genome-wide data sets. The second part of this dissertation presents an integrative analytic approach that uses Bayesian networks to represent the complex probabilistic dependency structure among many variables from family-based data. I propose a parameterization that extends mixed effects regression models to Bayesian networks by using random effects as additional nodes of the networks to model the between-subjects correlations. I also present the results of simulation studies that compare different model selection metrics for mixed models that can be used for learning Bayesian networks from correlated data, and I apply this methodology to real data from a large family-based study. In the third part of this dissertation, I describe an efficient way to account for family structure in Bayesian inference Using Gibbs Sampling (BUGS).
In linear mixed models, a random effects vector has a variance-covariance matrix whose dimension is as large as the sample size. However, direct handling of this multivariate normal distribution is not computationally feasible in BUGS. Therefore, I propose a decomposition of this multivariate normal distribution into univariate normal distributions using singular value decomposition, and I present its implementation in BUGS.
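The decomposition described above can be illustrated with a minimal sketch. It is written in Python with NumPy rather than BUGS, and it is not the dissertation's implementation. It assumes the random effects vector is distributed as N(0, sigma2 * K) for a known symmetric positive semi-definite matrix K. Writing the singular value decomposition as K = U D U^T, the vector U D^(1/2) z has the required covariance whenever the entries of z are independent univariate N(0, sigma2) draws. The matrix K below (a relationship matrix for two unrelated parents and two full siblings), the function names, and the value of sigma2 are all hypothetical choices made for illustration.

```python
import numpy as np


def svd_factor(cov):
    """Return L such that L @ L.T equals cov, for a symmetric PSD matrix.

    For such a matrix the SVD cov = U diag(s) V^T has U equal to V on the
    non-zero singular values, so L = U diag(sqrt(s)) is a valid factor.
    """
    u, s, _ = np.linalg.svd(cov)
    return u * np.sqrt(s)  # scales column j of u by sqrt(s[j])


def sample_random_effects(cov, sigma2, n_draws, rng):
    """Draw correlated random effects from independent univariate normals."""
    factor = svd_factor(cov)
    # Each entry of z is an independent N(0, sigma2) draw.
    z = rng.normal(0.0, np.sqrt(sigma2), size=(cov.shape[0], n_draws))
    # Each column of the result is one draw from N(0, sigma2 * cov).
    return factor @ z


# Hypothetical relationship matrix (twice the kinship coefficients) for a
# nuclear family: rows 0-1 are unrelated parents, rows 2-3 are full siblings.
K = np.array([
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
])

L = svd_factor(K)
```

With this factor, each subject's random effect is a fixed linear combination of independent univariate normal variables, which is the form a BUGS model can handle one node at a time. When translating such a sketch to BUGS, note that `dnorm` there is parameterized by precision (the inverse of the variance), not by the standard deviation used in the NumPy call above.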