Machine learning for effective predictions and prescriptions in health care
Early detection of acute hospitalizations and more efficient treatment are important for improving patients' long-term quality of life and reducing health care costs. This thesis develops data-driven methods to predict important health-related events and optimize treatment options. Applications include predicting chronic-disease-related hospitalizations, predicting the effect of interventions such as In Vitro Fertilization (IVF), and learning and improving upon physicians' prescription policies. For a binary hospitalization classification problem, a novel Alternating Clustering and Classification (ACC) method is proposed to strike a balance between the accuracy and the interpretability of the prediction. The method employs an alternating optimization approach that jointly identifies hidden patient clusters and adapts classifiers to each cluster. Convergence and out-of-sample guarantees for this algorithm are established. The algorithm is validated on large data sets from the Boston Medical Center, the largest safety-net hospital system in New England. For the IVF outcome prediction problem, several predictive models are designed that estimate the IVF success rate for women who have difficulty conceiving. For subjects predicted not to become pregnant, an algorithm further predicts whether no embryos were implanted (due to embryo abnormalities) or pregnancy did not occur despite implantation. Results are presented to assess the sensitivity of the models to specific predictive variables. The third problem models the patients' disease progression as a Markov Decision Process (MDP) and seeks to estimate the physicians' prescription policy and the disease state transition probabilities. Two regularized maximum likelihood estimation algorithms are proposed, one for learning the transition probability model and one for learning the policy. A sample complexity result is established that guarantees low regret with a relatively small number of training samples. 
The theoretical results are illustrated using a health care example. Finally, the thesis develops a framework for learning and improving the pharmacological therapy algorithm used by physicians to treat type 2 diabetes, based on prescription data. The proposed approach first predicts the outcomes of prescriptions using regression and then synthesizes from data a policy consistent with physicians' prescriptions, using a parametric multi-class classification method. Then, by optimizing over the parameters of the prescription model, the algorithm achieves better glucose control.
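
As a rough illustration of the alternating clustering-and-classification idea mentioned in the abstract, the following Python sketch alternates between fitting one classifier per cluster and reassigning samples to clusters. It is not the ACC algorithm of the thesis: the function names (`fit_logistic`, `per_sample_loss`, `alternate_cluster_classify`), the use of L2-regularized logistic regression trained by gradient descent, the random initial assignment, and the rule of reassigning each sample to the cluster whose classifier gives it the lowest loss are all assumptions made for this example. It also carries none of the convergence or out-of-sample guarantees stated above.

```python
import numpy as np


def fit_logistic(X, y, steps=300, lr=0.1, l2=1e-2):
    """Fit an L2-regularized logistic regression by gradient descent.

    X has shape (m, d); y holds labels in {0, 1}. Returns d + 1 weights,
    the last one being the intercept. (Illustrative choice of classifier.)
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))        # predicted probabilities
        grad = Xb.T @ (p - y) / len(y) + l2 * w    # gradient of regularized loss
        w -= lr * grad
    return w


def per_sample_loss(w, X, y):
    """Logistic loss of classifier w on each sample."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    signs = 2.0 * y - 1.0                          # map {0, 1} to {-1, +1}
    return np.logaddexp(0.0, -signs * (Xb @ w))    # log(1 + exp(-s * z))


def alternate_cluster_classify(X, y, n_clusters=2, n_rounds=10, seed=0):
    """Alternate between per-cluster classifier fitting and reassignment."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=len(y))  # random initial clusters
    weights = []
    for _ in range(n_rounds):
        # Classification step: fit one classifier per current cluster.
        weights = []
        for k in range(n_clusters):
            mask = labels == k
            if not mask.any():                      # empty cluster: use all data
                mask = np.ones(len(y), dtype=bool)
            weights.append(fit_logistic(X[mask], y[mask]))
        # Clustering step (assumed rule): move each sample to the cluster
        # whose classifier currently gives it the lowest loss.
        losses = np.stack([per_sample_loss(w, X, y) for w in weights], axis=1)
        new_labels = losses.argmin(axis=1)
        converged = np.array_equal(new_labels, labels)
        labels = new_labels
        if converged:
            break
    return labels, weights
```

Calling `alternate_cluster_classify(X, y)` on a feature matrix `X` and a 0/1 outcome vector `y` returns a cluster label for each sample and one weight vector per cluster. Because the reassignment rule in this sketch uses the outcome `y`, it does not say how to assign an unseen patient to a cluster; that step is left out here.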