
Multi-criteria text mining model for COVID-19 testing reasons and symptoms and temporal predictive model for COVID-19 test results in rural communities

This study is conducted to build a multi-criteria text mining model for COVID-19 testing reasons and symptoms. The model is integrated with a temporal predictive classification model for COVID-19 test results in rural underserved areas. A dataset of 6895 testing appointments and 14 features is used...
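The full description in this record says the text mining model assigns each free-text note to one or more categories using look-up wordlists. A minimal sketch of that look-up step might look like the following. The category names, the wordlists, and the `categorize` helper are all hypothetical (the record does not list the authors' wordlists), and the paper's multi-criteria mapping process is not reproduced here.

```python
import re
from typing import List

# Hypothetical look-up wordlists; the study's actual lists are not given in this record.
WORDLISTS = {
    "symptoms": {"fever", "cough", "headache"},
    "exposure": {"exposed", "contact"},
    "travel": {"travel", "trip"},
}

def categorize(note: str) -> List[str]:
    """Return every category whose wordlist shares at least one token with the note."""
    # Lowercase the note and split it into alphabetic tokens.
    tokens = set(re.findall(r"[a-z]+", note.lower()))
    matched = [name for name, words in WORDLISTS.items() if tokens & words]
    # Notes that match no wordlist fall into a catch-all category (an assumption).
    return matched or ["other"]

print(categorize("Fever and cough after contact with positive coworker"))
```

A note can land in several categories at once, which is how an unstructured field could be turned into a categorical feature for a downstream classifier.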


Bibliographic Details
Main Authors:	Abu Lekham, Laith; Wang, Yong; Hey, Ellen; Khasawneh, Mohammad T.
Format:	Online Article Text
Language:	English
Published:	Springer London 2022
Subjects:	Original Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8729325/ https://www.ncbi.nlm.nih.gov/pubmed/35013649 http://dx.doi.org/10.1007/s00521-021-06884-w

_version_	1784626915920838656
author	Abu Lekham, Laith; Wang, Yong; Hey, Ellen; Khasawneh, Mohammad T.
author_sort	Abu Lekham, Laith
collection	PubMed
description	This study is conducted to build a multi-criteria text mining model for COVID-19 testing reasons and symptoms. The model is integrated with a temporal predictive classification model for COVID-19 test results in rural underserved areas. A dataset of 6895 testing appointments and 14 features is used in this study. The text mining model classifies the notes related to the testing reasons and reported symptoms into one or more categories using look-up wordlists and a multi-criteria mapping process. The model converts an unstructured feature to a categorical feature that is used in building the temporal predictive classification model for COVID-19 test results and conducting some population analytics. The classification model is a temporal model (ordered and indexed by testing date) that uses machine learning classifiers to predict test results that are either positive or negative. Two types of classifiers and performance measures that include balanced and regular methods are used: (1) balanced random forest and (2) balanced bagged decision tree. The balanced or weighted methods are used to address and account for the biased and imbalanced dataset and to ensure correct detection of patients with COVID-19 (minority class). The model is tested in two stages using validation and testing sets to ensure robustness and reliability. The balanced classifiers outperformed regular classifiers using the balanced performance measures (balanced accuracy and G-score), which means the balanced classifiers are better at detecting patients with positive COVID-19 results. The balanced random forest achieved the best average balanced accuracy (86.1%) and G-score (86.1%) using the validation set. The balanced bagged decision tree achieved the best average balanced accuracy (83.0%) and G-score (82.8%) using the testing set. Also, it was found that the patient history, age, testing reasons, and time are the key features to classify the testing results.
format	Online Article Text
id	pubmed-8729325
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Springer London
record_format	MEDLINE/PubMed
spelling	pubmed-8729325 2022-01-06 Neural Comput Appl Original Article Springer London 2022-01-05 2022 /pmc/articles/PMC8729325/ /pubmed/35013649 http://dx.doi.org/10.1007/s00521-021-06884-w Text en © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title	Multi-criteria text mining model for COVID-19 testing reasons and symptoms and temporal predictive model for COVID-19 test results in rural communities
topic	Original Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8729325/ https://www.ncbi.nlm.nih.gov/pubmed/35013649 http://dx.doi.org/10.1007/s00521-021-06884-w
work_keys_str_mv	AT abulekhamlaith multicriteriatextminingmodelforcovid19testingreasonsandsymptomsandtemporalpredictivemodelforcovid19testresultsinruralcommunities AT wangyong multicriteriatextminingmodelforcovid19testingreasonsandsymptomsandtemporalpredictivemodelforcovid19testresultsinruralcommunities AT heyellen multicriteriatextminingmodelforcovid19testingreasonsandsymptomsandtemporalpredictivemodelforcovid19testresultsinruralcommunities AT khasawnehmohammadt multicriteriatextminingmodelforcovid19testingreasonsandsymptomsandtemporalpredictivemodelforcovid19testresultsinruralcommunities
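
The description above reports balanced accuracy and G-score as the measures used on the imbalanced data. As a rough illustration of why these differ from plain accuracy, the sketch below computes both from confusion-matrix counts. It assumes G-score means the geometric mean of sensitivity and specificity, which is a common definition but is not confirmed by this record. The `balanced_scores` helper and all counts are made up for illustration and are not the study's results.

```python
import math

def balanced_scores(tp: int, fn: int, tn: int, fp: int):
    """Return (balanced accuracy, G-score) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall on the positive (minority) class
    specificity = tn / (tn + fp)  # recall on the negative (majority) class
    balanced_accuracy = (sensitivity + specificity) / 2
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_score = math.sqrt(sensitivity * specificity)
    return balanced_accuracy, g_score

# Made-up counts: 100 positive and 900 negative tests.
ba, g = balanced_scores(tp=80, fn=20, tn=810, fp=90)
print(f"balanced accuracy={ba:.3f}, G-score={g:.3f}")

# A classifier that labels every test negative scores 90% plain accuracy here,
# yet both balanced measures expose that it finds no positive patients.
ba_all_neg, g_all_neg = balanced_scores(tp=0, fn=100, tn=900, fp=0)
print(f"balanced accuracy={ba_all_neg:.3f}, G-score={g_all_neg:.3f}")
```

With these illustrative counts, the first classifier gets a balanced accuracy of about 0.85, while the all-negative classifier drops to 0.5 with a G-score of 0.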
