Heart disease risk factors detection from electronic health records using advanced NLP and deep learning techniques
Heart disease remains the major cause of death, despite recent improvements in prediction and prevention. Risk factor identification is the main step in diagnosing and preventing heart disease. Automatically detecting risk factors for heart disease in clinical notes can help with disease progression...
Main authors: | Houssein, Essam H.; Mohamed, Rehab E.; Ali, Abdelmgeid A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10156668/ https://www.ncbi.nlm.nih.gov/pubmed/37138014 http://dx.doi.org/10.1038/s41598-023-34294-6 |
_version_ | 1785036587147460608 |
---|---|
author | Houssein, Essam H. Mohamed, Rehab E. Ali, Abdelmgeid A. |
author_facet | Houssein, Essam H. Mohamed, Rehab E. Ali, Abdelmgeid A. |
author_sort | Houssein, Essam H. |
collection | PubMed |
description | Heart disease remains the major cause of death, despite recent improvements in prediction and prevention. Risk factor identification is the main step in diagnosing and preventing heart disease. Automatically detecting risk factors for heart disease in clinical notes can help with disease progression modeling and clinical decision-making. Many studies have attempted to detect risk factors for heart disease, but none have identified all risk factors. These studies have proposed hybrid systems that combine knowledge-driven and data-driven techniques, based on dictionaries, rules, and machine learning methods that require significant human effort. The National Center for Informatics for Integrating Biology and the Bedside (i2b2) proposed a clinical natural language processing (NLP) challenge in 2014, with a track (Track 2) focused on detecting risk factors for heart disease in clinical notes over time. Clinical narratives provide a wealth of information that can be extracted using NLP and deep learning techniques. The objective of this paper is to improve on previous work from the 2014 i2b2 challenge by identifying tags and attributes relevant to disease diagnosis, risk factors, and medications, using advanced stacked word embedding techniques. Results on the i2b2 heart disease risk factors challenge dataset improved significantly with stacked embeddings, which combine several embedding types. Our model achieved an F1 score of 93.66% by stacking BERT and character embeddings (CHARACTER-BERT Embedding). The proposed model shows significant results compared with all other models and systems that we developed for the 2014 i2b2 challenge. |
format | Online Article Text |
id | pubmed-10156668 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10156668 2023-05-05 Heart disease risk factors detection from electronic health records using advanced NLP and deep learning techniques Houssein, Essam H. Mohamed, Rehab E. Ali, Abdelmgeid A. Sci Rep Article Heart disease remains the major cause of death, despite recent improvements in prediction and prevention. Risk factor identification is the main step in diagnosing and preventing heart disease. Automatically detecting risk factors for heart disease in clinical notes can help with disease progression modeling and clinical decision-making. Many studies have attempted to detect risk factors for heart disease, but none have identified all risk factors. These studies have proposed hybrid systems that combine knowledge-driven and data-driven techniques, based on dictionaries, rules, and machine learning methods that require significant human effort. The National Center for Informatics for Integrating Biology and the Bedside (i2b2) proposed a clinical natural language processing (NLP) challenge in 2014, with a track (Track 2) focused on detecting risk factors for heart disease in clinical notes over time. Clinical narratives provide a wealth of information that can be extracted using NLP and deep learning techniques. The objective of this paper is to improve on previous work from the 2014 i2b2 challenge by identifying tags and attributes relevant to disease diagnosis, risk factors, and medications, using advanced stacked word embedding techniques. Results on the i2b2 heart disease risk factors challenge dataset improved significantly with stacked embeddings, which combine several embedding types. Our model achieved an F1 score of 93.66% by stacking BERT and character embeddings (CHARACTER-BERT Embedding). The proposed model shows significant results compared with all other models and systems that we developed for the 2014 i2b2 challenge.
Nature Publishing Group UK 2023-05-03 /pmc/articles/PMC10156668/ /pubmed/37138014 http://dx.doi.org/10.1038/s41598-023-34294-6 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Houssein, Essam H. Mohamed, Rehab E. Ali, Abdelmgeid A. Heart disease risk factors detection from electronic health records using advanced NLP and deep learning techniques |
title | Heart disease risk factors detection from electronic health records using advanced NLP and deep learning techniques |
title_sort | heart disease risk factors detection from electronic health records using advanced nlp and deep learning techniques |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10156668/ https://www.ncbi.nlm.nih.gov/pubmed/37138014 http://dx.doi.org/10.1038/s41598-023-34294-6 |
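
The abstract in this record describes stacking embeddings as combining various embeddings, and reports an F1 score. The following is a minimal sketch of that idea, assuming that stacking means concatenating each token's vectors from every source; the record does not state the authors' implementation. The functions `stack_embeddings`, `toy_word_embedding`, and `toy_char_embedding` are hypothetical names, and the toy vectors are stand-ins rather than real BERT or character-model outputs. `f1_score` uses the standard harmonic-mean definition.

```python
from typing import Callable, List

Vector = List[float]


def stack_embeddings(tokens: List[str],
                     sources: List[Callable[[str], Vector]]) -> List[Vector]:
    """Concatenate, per token, the vectors produced by each embedding source.

    Assumption: "stacking" is plain concatenation along the feature axis.
    """
    stacked = []
    for token in tokens:
        combined: Vector = []
        for lookup in sources:
            combined.extend(lookup(token))
        stacked.append(combined)
    return stacked


def toy_word_embedding(token: str) -> Vector:
    # Hypothetical 3-dimensional stand-in for a word-level (e.g. BERT) vector.
    return [float(len(token)), float(token.islower()), float(token.isdigit())]


def toy_char_embedding(token: str) -> Vector:
    # Hypothetical 2-dimensional stand-in for a character-level vector.
    vowels = sum(ch in "aeiou" for ch in token.lower())
    return [float(vowels), float(len(token) - vowels)]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


tokens = ["hypertension", "A1c", "7"]
vectors = stack_embeddings(tokens, [toy_word_embedding, toy_char_embedding])
```

Each stacked vector has as many dimensions as the sources combined (here 3 + 2 = 5), so a downstream tagger sees word-level and character-level features side by side.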