Mendelian randomization with longitudinal data using functional data analysis approaches

Date
2023
Authors
Xu, Hanfei
Embargo Date
2026-09-17
Abstract
In the past few decades, evaluating causal relationships has attracted growing interest as a way to understand underlying disease mechanisms. Mendelian randomization (MR) is a useful approach that uses genetic variants as instrumental variables to investigate causal relationships between exposures and complex traits, and it can potentially overcome the confounding that affects epidemiological studies. However, the conventional MR method uses only cross-sectional data. Because data in observational studies are often collected repeatedly over time, an analysis that ignores these repeated measurements discards much of the available information. Moreover, time-varying effects of the exposure or covariates are neglected if these variables are treated as time-constant. Functional data analysis is a growing field that treats data as functions. In the longitudinal setting, repeated measurements can be regarded as functions of time. In this dissertation, we develop methods that leverage longitudinal information from repeatedly measured variables to evaluate the causal relationship between the exposure and the outcome, using approaches based on functional data analysis. First, we propose multivariable functional MR models that use functional principal component analysis (FPCA) to handle multiple time-varying exposures under a multivariable MR framework. We also introduce the concept of mean functional exposure, which yields interpretable causal effect estimates. Our simulation study demonstrates that the proposed models perform better than alternative methods that use only a single measurement, in terms of both statistical power and bias of the effect estimate. Second, we develop methods that incorporate FPCA and functional regression to handle a time-varying exposure and time-varying covariates simultaneously in an MR model.
Specifically, we apply an FPCA-based method to continuous time-varying variables and sparse logistic functional principal component analysis to binary time-varying variables. Through simulation studies, we show that our proposed models outperform models that treat the exposure and/or covariates as static measurements, in terms of both power and mean squared error. Finally, because the outcome is often a disease status, which is usually a binary variable, we extend the proposed multivariable functional MR models to binary outcomes by integrating the multivariable functional MR framework with the two-stage residual inclusion method. We illustrate the application of our proposed models with data from the Framingham Heart Study Offspring cohort to study the causal relationship between obesity indices and various bone-health-related measures or fractures. Our proposed methods advance causal inference research by making better use of longitudinal information, and thus can provide more insight into the relationship between exposures and the outcome of interest.
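The general idea of combining FPCA with an instrumental-variable estimate can be sketched on simulated data. The following is a minimal illustration under stated assumptions, not the dissertation's models: it uses a single genetic instrument, a single exposure observed on a dense regular time grid, a continuous outcome, and it takes "mean functional exposure" to mean the time-average of the FPCA-smoothed exposure curve, which is one plausible reading only. The function names (`fpca_scores`, `wald_ratio`, `simulate`) and all simulation parameters are hypothetical.

```python
import numpy as np


def fpca_scores(X, n_components):
    """FPCA on a dense regular grid: eigen-decompose the sample covariance."""
    mu = X.mean(axis=0)                          # estimated mean function
    Xc = X - mu                                  # centered curves
    cov = Xc.T @ Xc / (X.shape[0] - 1)           # sample covariance over the grid
    evals, evecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    phi = evecs[:, ::-1][:, :n_components]       # leading eigenvectors
    scores = Xc @ phi                            # principal component scores
    return mu, phi, scores


def wald_ratio(y, x, g):
    """Single-instrument IV estimate: cov(y, g) / cov(x, g)."""
    return np.cov(y, g)[0, 1] / np.cov(x, g)[0, 1]


def simulate(n=20000, n_times=20, beta=0.3, seed=0):
    """Hypothetical data: genotype g, confounder u, noisy exposure curves X."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_times)
    g = rng.binomial(2, 0.3, size=n).astype(float)    # allele count 0/1/2
    u = rng.normal(size=n)                            # unmeasured confounder
    curve = 1.0 + np.outer(g, 1.0 + t) + u[:, None]   # true exposure trajectories
    X = curve + rng.normal(scale=0.5, size=(n, n_times))  # measurement noise
    y = beta * curve.mean(axis=1) + u + rng.normal(size=n)
    return X, y, g


X, y, g = simulate()
mu, phi, scores = fpca_scores(X, n_components=2)

# Reconstruct smoothed curves from the leading components, then average over time.
X_smooth = mu + scores @ phi.T
mean_exposure = X_smooth.mean(axis=1)

beta_iv = wald_ratio(y, mean_exposure, g)
beta_ols = np.cov(y, mean_exposure)[0, 1] / np.var(mean_exposure, ddof=1)

print(f"IV estimate:  {beta_iv:.3f}")
print(f"OLS estimate: {beta_ols:.3f}")
```

In this simulation the true effect of the time-averaged exposure is 0.3. The instrument-based estimate should land near that value, while the ordinary regression estimate is pulled upward by the shared confounder. Sparse or irregular measurement times, multiple exposures, time-varying covariates, and binary outcomes would each need the more specialized machinery the abstract describes.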