Journal Watch: Natural Language Processing to Identify Stroke
Reviewed This Month
Improving Prehospital Stroke Diagnosis Using Natural Language Processing of Paramedic Reports
Authors: Mayampurath A, Parnianpour Z, Richards CT, et al.
Published in: Stroke, 2021 Aug; 52(8): 2676–9
We’ve reviewed a few stroke studies this year. While we’ve learned a lot, there’s still a lot we don’t know (like which prehospital stroke scale is best). One thing we know for sure, though, is that early identification and treatment lead to better patient outcomes.
The study we review in our final Journal Watch of 2021 is a step into the future: The authors used natural language processing and machine learning to predict hospital-confirmed stroke diagnoses from paramedic narratives. Natural language processing enables computers to process human language, in this case in the form of text, and find “features” far more quickly than a human could by reading through the narratives in multiple EMS reports. Machine learning allows computers to use data and algorithms to gradually improve their accuracy and recognize differences within a data set.
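To make the idea of text “features” more concrete, here is a minimal sketch in Python of how a computer might turn short narratives into countable word features. This is not the authors’ actual pipeline; the sample narratives are invented, and the scikit-learn library is assumed only for illustration.

# A minimal illustration (not the study's pipeline) of turning free-text
# narratives into word-count "features" a model can work with.
from sklearn.feature_extraction.text import CountVectorizer

# Invented sample narratives, one stroke-like and one not.
narratives = [
    "pt found with facial droop and slurred speech, onset 20 minutes ago",
    "pt c/o abdominal pain, no weakness noted, speech clear",
]

vectorizer = CountVectorizer()             # each unique word becomes one feature
features = vectorizer.fit_transform(narratives)

print(vectorizer.get_feature_names_out())  # the words the computer extracted
print(features.toarray())                  # counts of each word in each narrative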
Study Parameters
The study population was 9-1-1 patients transported by EMS to one of 17 primary or comprehensive stroke centers in Chicago. The study took place from November 28, 2018 to May 31, 2019. The authors analyzed data from prehospital care reports and the Get With The Guidelines–Stroke registries at the 17 stroke centers. The American Heart Association describes Get With The Guidelines–Stroke as an in-hospital program for improving stroke care by promoting consistent adherence to the latest scientific treatment guidelines.
The authors considered a patient an EMS-suspected stroke patient if suspected stroke appeared in the narrative, if the patient was transported to a stroke center, or if an abnormal prehospital stroke scale was documented. The authors included all hospital-confirmed stroke patients in the analysis even if they did not meet the definition of EMS-suspected stroke. Patients diagnosed with TIAs were excluded.
The primary outcome of interest was an acute stroke diagnosis. The authors examined secondary outcomes including severe stroke (defined as an NIHSS score greater than 5), acute stroke with large vessel occlusion, and intracerebral/subarachnoid hemorrhage. The primary features of interest were single words within the EMS report.
To facilitate the machine learning analysis, the data set was randomly split 70/30: 70% of the patients were used to develop the logistic regression models (in other words, to train the computer to run the most appropriate analysis), and the remaining 30% were used to test them. The final analysis focused on evaluating the association between the outcomes of interest and the clinical terms unilaterality, weakness, slurred speech, facial droop, and minutes ago. Each of these clinical terms appeared in paramedic narratives for hospital-confirmed stroke patients more often than for patients who did not have a diagnosed stroke. These text-based models were compared with models that used the Cincinnati Stroke Scale score and the Three-Item Stroke Scale to predict hospital-confirmed stroke diagnosis.
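For readers curious what a 70/30 split and a text-based logistic regression model look like in practice, the sketch below shows one common way to set this up in Python with scikit-learn. It is not the authors’ code; the data are randomly generated placeholders standing in for word-count features and hospital-confirmed stroke labels.

# A rough sketch of a 70/30 train/test split and a text-based logistic
# regression model, using made-up placeholder data (not the study's data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
features = rng.poisson(1.0, size=(200, 50))   # 200 "patients," 50 word-count features
stroke = rng.integers(0, 2, size=200)         # 0/1 hospital-confirmed stroke labels

# 70% of patients are used to develop (train) the model; 30% are held out to test it.
X_train, X_test, y_train, y_test = train_test_split(
    features, stroke, test_size=0.30, random_state=42, stratify=stroke
)

text_model = LogisticRegression(max_iter=1000)
text_model.fit(X_train, y_train)

# How well does the model separate stroke from non-stroke patients it has never seen?
predicted_risk = text_model.predict_proba(X_test)[:, 1]
print("AUROC on the held-out 30%:", roc_auc_score(y_test, predicted_risk))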
Results
There were 580 hospital-confirmed stroke patients included in the final analysis. The average age was 65 years, and there were 3% more females than males. Race/ethnicity was most often documented as Black (226, 39%), followed by white (190, 33%), Hispanic/Latino (79, 14%), unknown (60, 10%), and other (25, 4%). The average heart rate was 87 bpm, the average systolic blood pressure was 159 mmHg, and the average oxygen saturation was 96%. There were 264 (46%) patients who met the definition of severe stroke, 84 (15%) who had an acute stroke with large vessel occlusion, and 129 (22%) who had an intracerebral/subarachnoid hemorrhage. In-hospital death occurred in 22 (4%) stroke patients and 12 (3%) patients who were not diagnosed with stroke. There was no statistically significant difference in age, sex, race/ethnicity, initial vital signs, or in-hospital mortality when comparing stroke patients to those not diagnosed with stroke.
There was no statistically significant difference when comparing the text-based regression model to the Cincinnati Stroke Scale score model (p=0.165). However, the text-based model performed significantly better than the Three-Item Stroke Scale model (p<0.001). The text-based model also performed significantly better at predicting severe stroke compared with both the Cincinnati (p<0.01) and Three-Item (p<0.001) models. Similarly, the text-based model performed significantly better at predicting patients who had either an acute stroke with large vessel occlusion or an intracerebral/subarachnoid hemorrhage when compared with both the Cincinnati (p<0.01) and Three-Item (p<0.001) models. There was no statistically significant difference among the models when predicting acute stroke with large vessel occlusion alone.
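The p-values above come from head-to-head comparisons of model performance. The study’s exact statistical test is not described in this review, so the Python snippet below is only an illustrative sketch of one way two models’ discrimination (AUROC) could be compared on the same patients, using a simple bootstrap and entirely made-up labels and predicted probabilities.

# An illustrative bootstrap comparison of two models' AUROC (not necessarily
# the test the authors used). All data below are randomly generated.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=300)                                  # stroke yes/no
p_text = np.clip(0.30 * y_true + rng.uniform(0.2, 0.7, 300), 0, 1)     # text-based model
p_scale = np.clip(0.15 * y_true + rng.uniform(0.2, 0.7, 300), 0, 1)    # stroke-scale model

observed = roc_auc_score(y_true, p_text) - roc_auc_score(y_true, p_scale)

# Resample patients with replacement and see how often the AUROC difference crosses zero.
diffs = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), len(y_true))
    if len(np.unique(y_true[idx])) < 2:        # need both classes to compute an AUROC
        continue
    diffs.append(roc_auc_score(y_true[idx], p_text[idx])
                 - roc_auc_score(y_true[idx], p_scale[idx]))
diffs = np.asarray(diffs)
p_value = 2 * min((diffs <= 0).mean(), (diffs >= 0).mean())
print(f"AUROC difference: {observed:.3f}, bootstrap p-value: {p_value:.3f}")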
Conclusion
As with all studies, this one has limitations, including its retrospective design and its use of data from only one geographic region. Nevertheless, this was a fascinating and important addition to the literature. It is the first study to analyze clinical text from a paramedic narrative using natural language processing and machine learning to identify stroke patients. The results show this type of computing can use a paramedic narrative alone to identify stroke patients as well as or better than two validated prehospital stroke scales. The results of this study certainly support the need for more research in this area.
This study suggests that one day it may be possible to immediately analyze an EMS provider’s narrative to reliably identify stroke patients without the need for stroke scales. This may be possible with other time-critical illnesses too. Of course, we are still a long way from that today, and many more studies with large, generalizable data sets will need to be completed before then. However, the technological advances we continue to see have the potential to lead to better patient outcomes, and that is likely why most of you got into EMS in the first place.
Antonio R. Fernandez, PhD, NRP, FAHA, is a research scientist at ESO and serves on the board of advisors of the Prehospital Care Research Forum at UCLA.