Outcome-driven deep clustering for cardiovascular subtyping

Abstract
Common goals in cardiovascular research include understanding the factors associated with outcomes and identifying patient subgroups so that interventions to improve cardiovascular health can be targeted. Clustering patients on both clinical profiles and outcome associations can resolve subpopulations with shared characteristics within heterogeneous samples, which helps focus diagnostic and treatment strategies. We propose novel solutions that use an outcome to drive clinically meaningful clusters while reducing data complexity via a supervised autoencoder. We first develop the Outcome-Driven Deep Clustering (ODDC) framework, which combines a supervised autoencoder with downstream clustering: the autoencoder compresses the data into a latent representation optimized for predicting the outcome, and clustering on that representation supports the formation of distinct, clinically relevant subgroups. We apply this methodology to cluster cardiopulmonary exercise test (CPET) data, constrained by heart failure (HF) risk scores, to identify subgroups of middle-aged adults with varying risk profiles. Next, we extend ODDC to time-to-event data and embed imputation strategies for datasets with missing values. We use these models to uncover patient subgroups with distinct patterns of disease progression, leveraging survival time to characterize long-term risk trajectories both in individuals who develop cardiovascular disease (CVD) and in those who remain free of it. Lastly, we extend ODDC to the Outcome-Driven Deep Embedded Clustering (ODDEC) framework, which embeds clustering within the supervised autoencoder network for application to more complex, higher-dimensional data. This model successfully clusters metabolomic data, constrained by peak VO2, into subpopulations defined by exercise capacity and metabolic signatures.
By integrating machine learning with clinical insight, these approaches could substantially improve patient stratification, enabling more effective interventions and, ultimately, better heart health and fewer long-term cardiovascular complications.
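As an illustration only, the two-stage idea behind ODDC (a supervised autoencoder followed by downstream clustering of its latent representation) can be sketched in a few lines of NumPy. This is not the implementation used in this work. The linear encoder and decoder, the squared-error outcome loss, the weight `alpha`, the choice of k-means as the downstream clustering step, the synthetic data, and all function names are assumptions made for the sketch.

```python
import numpy as np

def train_supervised_autoencoder(X, y, latent_dim=2, alpha=1.0,
                                 lr=0.05, epochs=500, seed=0):
    """Linear supervised autoencoder fitted by gradient descent (assumed setup).

    Loss = mean reconstruction error + alpha * mean outcome-prediction error,
    so the latent space is shaped by the outcome as well as by the inputs.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, latent_dim))  # encoder weights
    W_dec = rng.normal(scale=0.1, size=(latent_dim, d))  # decoder weights
    w_out = rng.normal(scale=0.1, size=latent_dim)       # outcome head
    losses = []
    for _ in range(epochs):
        Z = X @ W_enc                 # latent representation
        R = Z @ W_dec - X             # reconstruction residual
        r = Z @ w_out - y             # outcome residual
        losses.append(np.mean(R ** 2) + alpha * np.mean(r ** 2))
        G_x = 2.0 * R / (n * d)       # dLoss / d(reconstruction)
        g_y = 2.0 * alpha * r / n     # dLoss / d(predicted outcome)
        grad_Z = G_x @ W_dec.T + np.outer(g_y, w_out)
        W_dec -= lr * (Z.T @ G_x)
        w_out -= lr * (Z.T @ g_y)
        W_enc -= lr * (X.T @ grad_Z)
    return W_enc, losses

def kmeans(Z, k=2, iters=50, seed=0):
    """Plain Lloyd's k-means, standing in for the downstream clustering step."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):   # leave an empty cluster's center unchanged
                centers[j] = Z[labels == j].mean(axis=0)
    return labels                     # assignments from the final iteration

# Synthetic stand-in data: two hidden groups that differ in three of six
# features and in a continuous outcome (purely hypothetical values).
rng = np.random.default_rng(42)
n = 200
group = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 6))
X[:, :3] += 2.0 * group[:, None]
y = 1.5 * group + rng.normal(scale=0.5, size=n)
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize features
y = (y - y.mean()) / y.std()               # standardize outcome

W_enc, losses = train_supervised_autoencoder(X, y)
Z = X @ W_enc                  # outcome-informed latent representation
labels = kmeans(Z, k=2)        # cluster in the latent space
```

Raising `alpha` in this sketch pushes the latent space toward directions that predict the outcome, while `alpha = 0` reduces it to an ordinary (unsupervised) linear autoencoder. A time-to-event extension would replace the squared-error outcome term with a survival loss, which this sketch does not attempt.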
Description
2025