
dc.contributor.author: Lu, Chen [en_US]
dc.date.accessioned: 2016-01-27T18:43:13Z
dc.date.available: 2016-01-27T18:43:13Z
dc.date.issued: 2013
dc.identifier.uri: https://hdl.handle.net/2144/14107
dc.description.abstract: Genetic variants identified to date by genome-wide association studies explain only a small fraction of total heritability. Gene-by-gene interaction is one important potential source of unexplained heritability. In the first part of this dissertation, a novel approach to detect such interactions is proposed. This approach utilizes penalized regression and sparse estimation principles, and incorporates outside biological knowledge through a network-based penalty. The method is tested on simulated data under various scenarios. Simulations show that with reasonable outside biological knowledge, the new method performs noticeably better than current stage-wise strategies in finding true interactions, especially when the marginal strength of main effects is weak. The proposed method is designed for single-cohort analyses. However, it is generally acknowledged that only multi-cohort analyses have sufficient power to uncover genes and gene-by-gene interactions with moderate effects on traits, such as those likely to underlie complex diseases. Multi-cohort meta-analysis approaches for penalized regressions are developed and investigated in the second part of this dissertation. Specifically, I propose two different ways of utilizing data-splitting principles in multi-cohort settings and develop three procedures to conduct meta-analysis. Using the method developed in the first part of this dissertation as an example of penalized regression, the three proposed meta-analysis procedures are compared to mega-analysis in a simulation study. The results suggest that the best approach is to split the participating cohorts into two groups, to perform variable selection for each cohort in the first group, to fit a regular regression model on the union of selected variables for each cohort in the second group, and lastly to conduct a meta-analysis across cohorts in the second group.
In the last part of this dissertation, the novel method developed in the first part is applied to the Framingham Heart Study measures of total plasma Immunoglobulin E (IgE) concentrations, C-reactive protein levels, and fasting glucose. The effect of incorporating various sources of biological information on the ability to detect gene-gene interaction is explored. For IgE, for example, a number of potentially interesting interactions are identified. Some of these interactions involve pairs of human leukocyte antigen genes, which encode proteins that are key regulators of the immune response. The remaining interactions are among genes previously found to be associated with IgE as main effects. Identification of these interactions may provide new insights into the genetic basis and mechanisms of atopic diseases. [en_US]
dc.language.iso: en_US
dc.subject: Biostatistics [en_US]
dc.subject: Gene-by-gene [en_US]
dc.subject: Genome wide association [en_US]
dc.subject: Interaction [en_US]
dc.subject: Meta analysis [en_US]
dc.subject: Network [en_US]
dc.subject: Penalized regression [en_US]
dc.title: New approaches to identify gene-by-gene interactions in genome wide association studies [en_US]
dc.type: Thesis/Dissertation [en_US]
dc.date.updated: 2016-01-22T18:53:52Z
etd.degree.name: Doctor of Philosophy [en_US]
etd.degree.level: doctoral [en_US]
etd.degree.discipline: Biostatistics [en_US]
etd.degree.grantor: Boston University [en_US]


