Distributed analyses of disease risk and association across networks of de-identified medical systems
McMurry, Andrew John
Health information networks continue to expand under the Affordable Care Act, yet little research has been done to query and analyze multiple patient populations in parallel. Differences between hospitals in patient demographics, treatment approaches, disease prevalences, and medical coding practices all pose significant challenges for multi-site analysis and interpretation. Furthermore, numerous methodological issues arise when attempting to analyze disease association in heterogeneous health care settings. These issues will only increase as greater numbers of hospitals are linked. To address these challenges, I developed the Shared Health Research Informatics Network (SHRINE), a distributed query and analysis system used by more than 60 health institutions for a wide range of disease studies. SHRINE was used to conduct one of the largest comorbidity studies in Autism Spectrum Disorders. SHRINE has enabled population-scale studies in diabetes, rheumatology, public health, and pathology. Using Natural Language Processing, we de-identify physician notes and query pathology reports to locate human tissues for high-throughput biological validation. Samples and evidence obtained using these methods supported novel discoveries in human metabolism and peripartum cardiomyopathy, respectively. Each hospital in the SHRINE network hosts a local peer database that cannot be overridden by any federal agency. SHRINE can search both coded clinical concepts and de-identified physician notes to obtain very large cohort sizes for analysis. SHRINE intelligently clusters phenotypic concepts to minimize differences in health care settings. I then analyzed a statewide sample of all Massachusetts acute care hospitals and found diagnosis codes useful for predicting Acute Myocardial Infarction (AMI). The AMI association methods selected 96 clinical concepts. Manual review of PubMed citations supported the automated associations.
AMI associations were most often discovered in the circulatory system and were most strongly linked to background diabetic retinopathy, diabetes with renal manifestations, and hypertension with complications. AMI risks were strongly associated with chronic kidney failure, liver diseases, chronic airway obstruction, hemodialysis procedures, and medical device complications. Learning the AMI-associated risk factors improved disease predictions for patients in Massachusetts acute care hospitals.