
Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models

The rising prevalence of diabetes and the increasing awareness of self-health management have resulted in a surge in diabetes patients seeking health information and emotional support in online health communities. Consequently, there is a vast database of patient consultation information in these on...


Bibliographic Details
Main Authors: Xu, Qian, Zhou, Yue, Liao, Bolin, Xin, Zirui, Xie, Wenzhao, Hu, Chao, Luo, Aijing
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10295683/
https://www.ncbi.nlm.nih.gov/pubmed/37370590
http://dx.doi.org/10.3390/bioengineering10060659
author Xu, Qian
Zhou, Yue
Liao, Bolin
Xin, Zirui
Xie, Wenzhao
Hu, Chao
Luo, Aijing
author_sort Xu, Qian
collection PubMed
description The rising prevalence of diabetes and the increasing awareness of self-health management have resulted in a surge in diabetes patients seeking health information and emotional support in online health communities. Consequently, there is a vast database of patient consultation information in these online health communities. However, due to the heterogeneity and incompleteness of the content, mining medical information and patient health data from these communities can be a challenge. To address this issue, we built the RoBERTa-BiLSTM-CRF (RBC) model for identifying entities in the online health community of diabetes. We selected 1889 question–answer texts from the most active online health community in China, Good Doctor Online, and used these public data to identify five types of entities. In addition, to validate the performance of our proposed model, we conducted a comparative evaluation with three other commonly used models: RoBERTa-CRF (RC), BiLSTM-CRF (BC), and RoBERTa-Softmax (RS). The results showed that the RBC model achieved excellent performance on the test set, with an accuracy of 81.2% and an F1 score of 80.7%, outperforming traditional entity recognition models on named entity recognition in online medical communities for doctors and diabetes patients. High-performance entity recognition in online health communities will provide a crucial knowledge source for constructing medical knowledge graphs. This integration would help alleviate the growing demand for medical consultations and the strain on healthcare resources, while assisting healthcare professionals in making informed decisions and providing personalized services to patients.
format Online
Article
Text
id pubmed-10295683
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10295683 2023-06-28 Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models Xu, Qian Zhou, Yue Liao, Bolin Xin, Zirui Xie, Wenzhao Hu, Chao Luo, Aijing Bioengineering (Basel) Article
MDPI 2023-05-29 /pmc/articles/PMC10295683/ /pubmed/37370590 http://dx.doi.org/10.3390/bioengineering10060659 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10295683/
https://www.ncbi.nlm.nih.gov/pubmed/37370590
http://dx.doi.org/10.3390/bioengineering10060659