Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts
BACKGROUND AND OBJECTIVE: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify the...
Main authors: | Segura-Bedmar, Isabel; Camino-Perdones, David; Guerrero-Aspizua, Sara |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9258216/ https://www.ncbi.nlm.nih.gov/pubmed/35794528 http://dx.doi.org/10.1186/s12859-022-04810-y |
_version_ | 1784741497601523712 |
---|---|
author | Segura-Bedmar, Isabel Camino-Perdones, David Guerrero-Aspizua, Sara |
author_facet | Segura-Bedmar, Isabel Camino-Perdones, David Guerrero-Aspizua, Sara |
author_sort | Segura-Bedmar, Isabel |
collection | PubMed |
description | BACKGROUND AND OBJECTIVE: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify them. In addition to this, rare diseases usually show a wide variety of manifestations, which might make the diagnosis even more difficult. A delayed diagnosis can negatively affect the patient’s life. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) and Deep Learning can help to extract relevant information about rare diseases to facilitate their diagnosis and treatments. METHODS: The paper explores several deep learning techniques such as Bidirectional Long Short Term Memory (BiLSTM) networks or deep contextualized word representations based on Bidirectional Encoder Representations from Transformers (BERT) to recognize rare diseases and their clinical manifestations (signs and symptoms). RESULTS: BioBERT, a domain-specific language representation based on BERT and trained on biomedical corpora, obtains the best results with an F1 of 85.2% for rare diseases. Since many signs are usually described by complex noun phrases that involve the use of overlapped, nested and discontinuous entities, the model provides lower results with an F1 of 57.2%. CONCLUSIONS: While our results are promising, there is still much room for improvement, especially with respect to the identification of clinical manifestations (signs and symptoms). |
format | Online Article Text |
id | pubmed-9258216 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9258216 2022-07-07 Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts Segura-Bedmar, Isabel Camino-Perdones, David Guerrero-Aspizua, Sara BMC Bioinformatics Research BACKGROUND AND OBJECTIVE: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify them. In addition to this, rare diseases usually show a wide variety of manifestations, which might make the diagnosis even more difficult. A delayed diagnosis can negatively affect the patient’s life. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) and Deep Learning can help to extract relevant information about rare diseases to facilitate their diagnosis and treatments. METHODS: The paper explores several deep learning techniques such as Bidirectional Long Short Term Memory (BiLSTM) networks or deep contextualized word representations based on Bidirectional Encoder Representations from Transformers (BERT) to recognize rare diseases and their clinical manifestations (signs and symptoms). RESULTS: BioBERT, a domain-specific language representation based on BERT and trained on biomedical corpora, obtains the best results with an F1 of 85.2% for rare diseases. Since many signs are usually described by complex noun phrases that involve the use of overlapped, nested and discontinuous entities, the model provides lower results with an F1 of 57.2%. CONCLUSIONS: While our results are promising, there is still much room for improvement, especially with respect to the identification of clinical manifestations (signs and symptoms).
BioMed Central 2022-07-06 /pmc/articles/PMC9258216/ /pubmed/35794528 http://dx.doi.org/10.1186/s12859-022-04810-y Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Segura-Bedmar, Isabel Camino-Perdones, David Guerrero-Aspizua, Sara Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts |
title | Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts |
title_full | Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts |
title_fullStr | Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts |
title_full_unstemmed | Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts |
title_short | Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts |
title_sort | exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9258216/ https://www.ncbi.nlm.nih.gov/pubmed/35794528 http://dx.doi.org/10.1186/s12859-022-04810-y |
work_keys_str_mv | AT segurabedmarisabel exploringdeeplearningmethodsforrecognizingrarediseasesandtheirclinicalmanifestationsfromtexts AT caminoperdonesdavid exploringdeeplearningmethodsforrecognizingrarediseasesandtheirclinicalmanifestationsfromtexts AT guerreroaspizuasara exploringdeeplearningmethodsforrecognizingrarediseasesandtheirclinicalmanifestationsfromtexts |