A Knowledge-Driven Approach to Extract Disease-Related Biomarkers from the Literature

The biomedical literature represents a rich source of biomarker information. However, both the size of literature databases and their lack of standardization hamper the automatic exploitation of the information contained in these resources. Text mining approaches have proven to be useful for the exp...

Bibliographic Details
Main authors: Bravo, À., Cases, M., Queralt-Rosinach, N., Sanz, F., Furlong, L. I.
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4009255/
https://www.ncbi.nlm.nih.gov/pubmed/24839601
http://dx.doi.org/10.1155/2014/253128
author Bravo, À.
Cases, M.
Queralt-Rosinach, N.
Sanz, F.
Furlong, L. I.
collection PubMed
description The biomedical literature represents a rich source of biomarker information. However, both the size of literature databases and their lack of standardization hamper the automatic exploitation of the information contained in these resources. Text mining approaches have proven to be useful for the exploitation of information contained in the scientific publications. Here, we show that a knowledge-driven text mining approach can exploit a large literature database to extract a dataset of biomarkers related to diseases covering all therapeutic areas. Our methodology takes advantage of the annotation of MEDLINE publications pertaining to biomarkers with MeSH terms, narrowing the search to specific publications and, therefore, minimizing the false positive ratio. It is based on a dictionary-based named entity recognition system and a relation extraction module. The application of this methodology resulted in the identification of 131,012 disease-biomarker associations between 2,803 genes and 2,751 diseases, and represents a valuable knowledge base for those interested in disease-related biomarkers. Additionally, we present a bibliometric analysis of the journals reporting biomarker related information during the last 40 years.
format Online
Article
Text
id pubmed-4009255
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4009255 2014-05-18 Biomed Res Int Research Article Hindawi Publishing Corporation 2014 2014-04-16 /pmc/articles/PMC4009255/ /pubmed/24839601 http://dx.doi.org/10.1155/2014/253128 Text en Copyright © 2014 À. Bravo et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Knowledge-Driven Approach to Extract Disease-Related Biomarkers from the Literature
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4009255/
https://www.ncbi.nlm.nih.gov/pubmed/24839601
http://dx.doi.org/10.1155/2014/253128