
dc.contributor.author | McMurry, Andrew John | en_US
dc.date.accessioned | 2016-01-13T18:47:21Z |
dc.date.available | 2016-01-13T18:47:21Z |
dc.date.issued | 2015 |
dc.identifier.uri | https://hdl.handle.net/2144/14002 |
dc.description.abstract | Health information networks continue to expand under the Affordable Care Act, yet little research has been done to query and analyze multiple patient populations in parallel. Differences between hospitals relating to patient demographics, treatment approaches, disease prevalences, and medical coding practices all pose significant challenges for multi-site analysis and interpretation. Furthermore, numerous methodological issues arise when attempting to analyze disease association in heterogeneous health care settings. These issues will only continue to increase as greater numbers of hospitals are linked. To address these challenges, I developed the Shared Health Research Informatics Network (SHRINE), a distributed query and analysis system used by more than 60 health institutions for a wide range of disease studies. SHRINE was used to conduct one of the largest comorbidity studies in Autism Spectrum Disorders. SHRINE has enabled population-scale studies in diabetes, rheumatology, public health, and pathology. Using Natural Language Processing, we de-identify physician notes and query pathology reports to locate human tissues for high-throughput biological validation. Samples and evidence obtained using these methods supported novel discoveries in human metabolism and peripartum cardiomyopathy, respectively. Each hospital in the SHRINE network hosts a local peer database that cannot be overridden by any federal agency. SHRINE can search both coded clinical concepts and de-identified physician notes to obtain very large cohort sizes for analysis. SHRINE intelligently clusters phenotypic concepts to minimize differences in health care settings. I then analyzed a statewide sample of all Massachusetts acute care hospitals and found diagnosis codes useful for predicting Acute Myocardial Infarction (AMI). The AMI association methods selected 96 clinical concepts. Manual review of PubMed citations supported the automated associations.
AMI associations were most often discovered in the circulatory system and were most strongly linked to background diabetic retinopathy, diabetes with renal manifestations, and hypertension with complications. AMI risks were strongly associated with chronic kidney failure, liver diseases, chronic airway obstruction, hemodialysis procedures, and medical device complications. Learning the AMI associated risk factors improved disease predictions for patients in Massachusetts acute care hospitals. | en_US
dc.language.iso | en_US |
dc.rights | Attribution 4.0 International | en_US
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ |
dc.subject | Bioinformatics | en_US
dc.subject | Deidentification | en_US
dc.subject | Health research policy | en_US
dc.subject | NLP | en_US
dc.subject | Patient privacy | en_US
dc.subject | Data mining | en_US
dc.subject | Health information technology | en_US
dc.title | Distributed analyses of disease risk and association across networks of de-identified medical systems | en_US
dc.type | Thesis/Dissertation | en_US
dc.date.updated | 2015-11-09T14:25:46Z |
etd.degree.name | Doctor of Philosophy | en_US
etd.degree.level | doctoral | en_US
etd.degree.discipline | Bioinformatics | en_US
etd.degree.grantor | Boston University | en_US



Except where otherwise noted, this item's license is described as Attribution 4.0 International