Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts

BACKGROUND AND OBJECTIVE: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify the...

Bibliographic Details
Main Authors: Segura-Bedmar, Isabel; Camino-Perdones, David; Guerrero-Aspizua, Sara
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9258216/
https://www.ncbi.nlm.nih.gov/pubmed/35794528
http://dx.doi.org/10.1186/s12859-022-04810-y
_version_ 1784741497601523712
author Segura-Bedmar, Isabel
Camino-Perdones, David
Guerrero-Aspizua, Sara
author_facet Segura-Bedmar, Isabel
Camino-Perdones, David
Guerrero-Aspizua, Sara
author_sort Segura-Bedmar, Isabel
collection PubMed
description BACKGROUND AND OBJECTIVE: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify them. In addition, rare diseases usually show a wide variety of manifestations, which might make the diagnosis even more difficult. A delayed diagnosis can negatively affect the patient’s life. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) and Deep Learning can help to extract relevant information about rare diseases to facilitate their diagnosis and treatment. METHODS: The paper explores several deep learning techniques, such as Bidirectional Long Short-Term Memory (BiLSTM) networks and deep contextualized word representations based on Bidirectional Encoder Representations from Transformers (BERT), to recognize rare diseases and their clinical manifestations (signs and symptoms). RESULTS: BioBERT, a domain-specific language representation based on BERT and trained on biomedical corpora, obtains the best results, with an F1 of 85.2% for rare diseases. Since many signs are usually described by complex noun phrases that involve the use of overlapped, nested and discontinuous entities, the model provides lower results for signs, with an F1 of 57.2%. CONCLUSIONS: While our results are promising, there is still much room for improvement, especially with respect to the identification of clinical manifestations (signs and symptoms).
format Online
Article
Text
id pubmed-9258216
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9258216 2022-07-07 Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts Segura-Bedmar, Isabel Camino-Perdones, David Guerrero-Aspizua, Sara BMC Bioinformatics Research
BioMed Central 2022-07-06 /pmc/articles/PMC9258216/ /pubmed/35794528 http://dx.doi.org/10.1186/s12859-022-04810-y Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Segura-Bedmar, Isabel
Camino-Perdones, David
Guerrero-Aspizua, Sara
Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts
title Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9258216/
https://www.ncbi.nlm.nih.gov/pubmed/35794528
http://dx.doi.org/10.1186/s12859-022-04810-y