A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification
OBJECTIVE: With COVID-19, there was a need for a rapidly scalable annotation system that facilitated real-time integration with clinical decision support systems (CDS). Current annotation systems suffer from a high-resource utilization and poor scalability limiting real-world integration with CDS. A...
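Main authors: | Sahoo, Himanshu S; Silverman, Greg M; Ingraham, Nicholas E; Lupei, Monica I; Puskarich, Michael A; Finzel, Raymond L; Sartori, John; Zhang, Rui; Knoll, Benjamin C; Liu, Sijia; Liu, Hongfang; Melton, Genevieve B; Tignanelli, Christopher J; Pakhomov, Serguei V S |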
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8374371/ https://www.ncbi.nlm.nih.gov/pubmed/34423261 http://dx.doi.org/10.1093/jamiaopen/ooab070 |
_version_ | 1783740101921079296 |
---|---|
author | Sahoo, Himanshu S Silverman, Greg M Ingraham, Nicholas E Lupei, Monica I Puskarich, Michael A Finzel, Raymond L Sartori, John Zhang, Rui Knoll, Benjamin C Liu, Sijia Liu, Hongfang Melton, Genevieve B Tignanelli, Christopher J Pakhomov, Serguei V S |
author_facet | Sahoo, Himanshu S Silverman, Greg M Ingraham, Nicholas E Lupei, Monica I Puskarich, Michael A Finzel, Raymond L Sartori, John Zhang, Rui Knoll, Benjamin C Liu, Sijia Liu, Hongfang Melton, Genevieve B Tignanelli, Christopher J Pakhomov, Serguei V S |
author_sort | Sahoo, Himanshu S |
collection | PubMed |
description | OBJECTIVE: With COVID-19, there was a need for a rapidly scalable annotation system that facilitated real-time integration with clinical decision support systems (CDS). Current annotation systems suffer from a high-resource utilization and poor scalability limiting real-world integration with CDS. A potential solution to mitigate these issues is to use the rule-based gazetteer developed at our institution. MATERIALS AND METHODS: Performance, resource utilization, and runtime of the rule-based gazetteer were compared with five annotation systems: BioMedICUS, cTAKES, MetaMap, CLAMP, and MedTagger. RESULTS: This rule-based gazetteer was the fastest, had a low resource footprint, and similar performance for weighted microaverage and macroaverage measures of precision, recall, and f1-score compared to other annotation systems. DISCUSSION: Opportunities to increase its performance include fine-tuning lexical rules for symptom identification. Additionally, it could run on multiple compute nodes for faster runtime. CONCLUSION: This rule-based gazetteer overcame key technical limitations facilitating real-time symptomatology identification for COVID-19 and integration of unstructured data elements into our CDS. It is ideal for large-scale deployment across a wide variety of healthcare settings for surveillance of acute COVID-19 symptoms for integration into prognostic modeling. Such a system is currently being leveraged for monitoring of postacute sequelae of COVID-19 (PASC) progression in COVID-19 survivors. This study conducted the first in-depth analysis and developed a rule-based gazetteer for COVID-19 symptom extraction with the following key features: low processor and memory utilization, faster runtime, and similar weighted microaverage and macroaverage measures for precision, recall, and f1-score compared to industry-standard annotation systems. |
format | Online Article Text |
id | pubmed-8374371 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-83743712021-08-20 A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification Sahoo, Himanshu S Silverman, Greg M Ingraham, Nicholas E Lupei, Monica I Puskarich, Michael A Finzel, Raymond L Sartori, John Zhang, Rui Knoll, Benjamin C Liu, Sijia Liu, Hongfang Melton, Genevieve B Tignanelli, Christopher J Pakhomov, Serguei V S JAMIA Open Research and Applications OBJECTIVE: With COVID-19, there was a need for a rapidly scalable annotation system that facilitated real-time integration with clinical decision support systems (CDS). Current annotation systems suffer from a high-resource utilization and poor scalability limiting real-world integration with CDS. A potential solution to mitigate these issues is to use the rule-based gazetteer developed at our institution. MATERIALS AND METHODS: Performance, resource utilization, and runtime of the rule-based gazetteer were compared with five annotation systems: BioMedICUS, cTAKES, MetaMap, CLAMP, and MedTagger. RESULTS: This rule-based gazetteer was the fastest, had a low resource footprint, and similar performance for weighted microaverage and macroaverage measures of precision, recall, and f1-score compared to other annotation systems. DISCUSSION: Opportunities to increase its performance include fine-tuning lexical rules for symptom identification. Additionally, it could run on multiple compute nodes for faster runtime. CONCLUSION: This rule-based gazetteer overcame key technical limitations facilitating real-time symptomatology identification for COVID-19 and integration of unstructured data elements into our CDS. It is ideal for large-scale deployment across a wide variety of healthcare settings for surveillance of acute COVID-19 symptoms for integration into prognostic modeling. Such a system is currently being leveraged for monitoring of postacute sequelae of COVID-19 (PASC) progression in COVID-19 survivors. 
This study conducted the first in-depth analysis and developed a rule-based gazetteer for COVID-19 symptom extraction with the following key features: low processor and memory utilization, faster runtime, and similar weighted microaverage and macroaverage measures for precision, recall, and f1-score compared to industry-standard annotation systems. Oxford University Press 2021-08-07 /pmc/articles/PMC8374371/ /pubmed/34423261 http://dx.doi.org/10.1093/jamiaopen/ooab070 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Research and Applications Sahoo, Himanshu S Silverman, Greg M Ingraham, Nicholas E Lupei, Monica I Puskarich, Michael A Finzel, Raymond L Sartori, John Zhang, Rui Knoll, Benjamin C Liu, Sijia Liu, Hongfang Melton, Genevieve B Tignanelli, Christopher J Pakhomov, Serguei V S A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification |
title | A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification |
title_full | A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification |
title_fullStr | A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification |
title_full_unstemmed | A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification |
title_short | A fast, resource efficient, and reliable rule-based system for COVID-19 symptom identification |
title_sort | fast, resource efficient, and reliable rule-based system for covid-19 symptom identification |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8374371/ https://www.ncbi.nlm.nih.gov/pubmed/34423261 http://dx.doi.org/10.1093/jamiaopen/ooab070 |