Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models
The rising prevalence of diabetes and the increasing awareness of self-health management have resulted in a surge in diabetes patients seeking health information and emotional support in online health communities. Consequently, there is a vast database of patient consultation information in these on...
Main Authors: | Xu, Qian; Zhou, Yue; Liao, Bolin; Xin, Zirui; Xie, Wenzhao; Hu, Chao; Luo, Aijing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10295683/ https://www.ncbi.nlm.nih.gov/pubmed/37370590 http://dx.doi.org/10.3390/bioengineering10060659 |
_version_ | 1785063479622762496 |
---|---|
author | Xu, Qian; Zhou, Yue; Liao, Bolin; Xin, Zirui; Xie, Wenzhao; Hu, Chao; Luo, Aijing |
author_facet | Xu, Qian; Zhou, Yue; Liao, Bolin; Xin, Zirui; Xie, Wenzhao; Hu, Chao; Luo, Aijing |
author_sort | Xu, Qian |
collection | PubMed |
description | The rising prevalence of diabetes and the increasing awareness of self-health management have resulted in a surge in diabetes patients seeking health information and emotional support in online health communities. Consequently, there is a vast database of patient consultation information in these online health communities. However, due to the heterogeneity and incompleteness of the content, mining medical information and patient health data from these communities can be a challenge. To address this issue, we built the RoBERTa-BiLSTM-CRF (RBC) model for identifying entities in the online health community of diabetes. We selected 1889 question–answer texts from the most active online health community in China, Good Doctor Online, and used these public data to identify five types of entities. In addition, to validate the performance of our proposed model, we conducted a comparative evaluation with three other commonly used models: RoBERTa-CRF (RC), BiLSTM-CRF (BC), and RoBERTa-Softmax (RS). The results showed that the RBC model achieved excellent performance on the test set, with an accuracy of 81.2% and an F1 score of 80.7%, outperforming traditional entity recognition models on named entity recognition in online medical communities for doctors and diabetes patients. The high performance of entity recognition in online health communities will provide a crucial knowledge source for constructing medical knowledge graphs. This integration would help alleviate the growing demand for medical consultations and the strain on healthcare resources, while assisting healthcare professionals in making informed decisions and providing personalized services to patients. |
format | Online Article Text |
id | pubmed-10295683 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10295683 2023-06-28 Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models Xu, Qian Zhou, Yue Liao, Bolin Xin, Zirui Xie, Wenzhao Hu, Chao Luo, Aijing Bioengineering (Basel) Article The rising prevalence of diabetes and the increasing awareness of self-health management have resulted in a surge in diabetes patients seeking health information and emotional support in online health communities. Consequently, there is a vast database of patient consultation information in these online health communities. However, due to the heterogeneity and incompleteness of the content, mining medical information and patient health data from these communities can be a challenge. To address this issue, we built the RoBERTa-BiLSTM-CRF (RBC) model for identifying entities in the online health community of diabetes. We selected 1889 question–answer texts from the most active online health community in China, Good Doctor Online, and used these public data to identify five types of entities. In addition, to validate the performance of our proposed model, we conducted a comparative evaluation with three other commonly used models: RoBERTa-CRF (RC), BiLSTM-CRF (BC), and RoBERTa-Softmax (RS). The results showed that the RBC model achieved excellent performance on the test set, with an accuracy of 81.2% and an F1 score of 80.7%, outperforming traditional entity recognition models on named entity recognition in online medical communities for doctors and diabetes patients. The high performance of entity recognition in online health communities will provide a crucial knowledge source for constructing medical knowledge graphs. This integration would help alleviate the growing demand for medical consultations and the strain on healthcare resources, while assisting healthcare professionals in making informed decisions and providing personalized services to patients. MDPI 2023-05-29 /pmc/articles/PMC10295683/ /pubmed/37370590 http://dx.doi.org/10.3390/bioengineering10060659 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Xu, Qian Zhou, Yue Liao, Bolin Xin, Zirui Xie, Wenzhao Hu, Chao Luo, Aijing Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models |
title | Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models |
title_full | Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models |
title_fullStr | Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models |
title_full_unstemmed | Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models |
title_short | Named Entity Recognition of Diabetes Online Health Community Data Using Multiple Machine Learning Models |
title_sort | named entity recognition of diabetes online health community data using multiple machine learning models |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10295683/ https://www.ncbi.nlm.nih.gov/pubmed/37370590 http://dx.doi.org/10.3390/bioengineering10060659 |
work_keys_str_mv | AT xuqian namedentityrecognitionofdiabetesonlinehealthcommunitydatausingmultiplemachinelearningmodels AT zhouyue namedentityrecognitionofdiabetesonlinehealthcommunitydatausingmultiplemachinelearningmodels AT liaobolin namedentityrecognitionofdiabetesonlinehealthcommunitydatausingmultiplemachinelearningmodels AT xinzirui namedentityrecognitionofdiabetesonlinehealthcommunitydatausingmultiplemachinelearningmodels AT xiewenzhao namedentityrecognitionofdiabetesonlinehealthcommunitydatausingmultiplemachinelearningmodels AT huchao namedentityrecognitionofdiabetesonlinehealthcommunitydatausingmultiplemachinelearningmodels AT luoaijing namedentityrecognitionofdiabetesonlinehealthcommunitydatausingmultiplemachinelearningmodels |
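
The description above reports an F1 score for entity recognition but does not say how it was computed. As a rough illustration only, the following sketch shows one common convention: entity-level precision, recall, and F1 over BIO-tagged sequences, where a predicted entity counts as correct only if its type and boundaries both match exactly. The BIO scheme, the function names `extract_spans` and `entity_prf`, and the labels `DRUG` and `SYMPTOM` are assumptions made for this example; the record does not name the five entity types or the authors' scoring procedure.

```python
from typing import List, Set, Tuple

# (entity_type, start_index, end_index_exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sequence.

    An "I-X" tag that does not continue a span of type X is ignored
    (a strict reading of BIO; other conventions exist).
    """
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            else:
                start, label = None, None
    return spans


def entity_prf(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged entity-level precision, recall, and F1."""
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = extract_spans(gold_tags)
        pred_spans = extract_spans(pred_tags)
        tp += len(gold_spans & pred_spans)   # exact type and boundary match
        fp += len(pred_spans - gold_spans)   # predicted but not in gold
        fn += len(gold_spans - pred_spans)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels; the record does not list the actual entity types.
gold = [["B-DRUG", "I-DRUG", "O", "B-SYMPTOM"]]
pred = [["B-DRUG", "I-DRUG", "O", "O"]]
precision, recall, f1 = entity_prf(gold, pred)
```

In this toy case the prediction recovers one of the two gold entities and proposes no spurious ones, so precision is 1.0 and recall is 0.5. Exact-match scoring is stricter than token-level accuracy, which is one reason the two numbers in a report like this can differ.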