Use of natural language processing and a machine learning model to identify patients presenting with various types of status epilepticus

Abstract
BACKGROUND: Status epilepticus (SE) is a frequent pediatric emergency that necessitates urgent response and adherence to proper emergency treatment protocols. Patients with a history of prolonged seizures and status epilepticus are at risk of SE recurrence. Identifying patients with a history of SE may permit patient monitoring and interventions and ultimately improve treatment and patient outcomes. The gold standard for patient identification via electronic health records is manual review. We sought to use natural language processing (NLP), employing Regular Expressions in the Document Review Tool (DrT) software, to efficiently identify patients presenting with established status epilepticus (ESE) and refractory status epilepticus (rSE).

METHODS: We obtained electronic health records for patients between the ages of 1 month and 21 years who presented to Boston Children’s Hospital (BCH). The Clinical Research Informatics Technology (CRIT) team at BCH obtained records from 2013 to 2020 from specific emergency and neurology departments. We included all patients from these notes who experienced at least one convulsive ESE or rSE event during their admission. ESE is defined as failure to respond to first-line benzodiazepine treatment, followed by seizure control after the first second-line non-benzodiazepine treatment. In contrast, rSE is defined as failure to respond to first-line benzodiazepine treatment and to a single second-line treatment, followed by seizure termination after a subsequent second-line treatment. We trained the tool on a set of records from 2017-2019 using a Support Vector Machine with a polynomial kernel as part of a machine learning-NLP algorithm (Chafjiri et al, 2023). We then used the tool to identify patients with ESE and rSE in records from 2013-2020, excluding the 2017-2019 training period. We assigned each note a Regular Expressions (RegEx) machine-learning score; notes scoring above a cutoff were more likely to be cases. We chose a cutoff score of -10,984,087 for this study based on the Receiver Operating Characteristic (ROC) curve from the previously trained tool to ensure 94% accuracy and 94% specificity. To further evaluate the effectiveness of DrT-assisted review, we compared the results to the manual review of notes from the pediatric Status Epilepticus Research Group (pSERG) consortium screening log during the test period.

RESULTS: DrT software identified 169 patients with rSE, a sensitivity of 98.2% (95% CI: 0.95-1.0), whereas manual review identified only 115 patients with rSE, a sensitivity of 68% (95% CI: 0.61-0.75). For ESE, manual review identified 91 patients, a sensitivity of 43.8% (CI: 0.37-0.51), compared with 208 patients identified using DrT software, a sensitivity of 99.5% (CI: 0.97-1.0). DrT missed a total of 4 cases that were successfully identified in manual review: 3 rSE and 1 ESE. DrT identified a total of 175 cases (57 rSE and 118 ESE) that were not found during the manual review.

CONCLUSION: DrT-assisted review identified substantially more patients experiencing either established or refractory SE than manual review did. NLP-related software can enhance patient identification and aid in future treatment protocols, allowing improved studies and acute and preventative care interventions.
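The DrT sensitivities reported above can be approximated from the counts in the abstract. The following is a minimal sketch, not the authors' analysis code. It assumes that the reference set for each SE type is the DrT-identified cases plus the cases DrT missed, and it uses a Wilson score interval for the 95% CI. The abstract states neither the denominator nor the interval method, so both are assumptions, and the function names are illustrative only.

```python
import math


def sensitivity(found: int, missed: int) -> float:
    """True positives divided by all reference cases (found + missed)."""
    return found / (found + missed)


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: (cases found by DrT, cases missed by DrT).
drt_counts = {"rSE": (169, 3), "ESE": (208, 1)}

for label, (found, missed) in drt_counts.items():
    sens = sensitivity(found, missed)
    low, high = wilson_interval(found, found + missed)
    print(f"{label}: sensitivity {sens:.1%}, 95% CI {low:.2f}-{high:.2f}")
```

Under these assumptions the computed values land close to the reported 98.2% and 99.5%. The manual-review sensitivities are not reproduced here because the abstract does not state which denominator was used for them.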
Description
2024