Ranking relations between diseases, drugs and genes for a curation task

BACKGROUND: One of the key pieces of information which biomedical text mining systems are expected to extract from the literature are interactions among different types of biomedical entities (proteins, genes, diseases, drugs, etc.). Several large resources of curated relations between biomedical en...

Bibliographic Details
Main Authors:	Clematide, Simon, Rinaldi, Fabio
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3465213/ https://www.ncbi.nlm.nih.gov/pubmed/23046495 http://dx.doi.org/10.1186/2041-1480-3-S3-S5

_version_	1782245529110970368
author	Clematide, Simon Rinaldi, Fabio
author_facet	Clematide, Simon Rinaldi, Fabio
author_sort	Clematide, Simon
collection	PubMed
description	BACKGROUND: One of the key pieces of information which biomedical text mining systems are expected to extract from the literature are interactions among different types of biomedical entities (proteins, genes, diseases, drugs, etc.). Several large resources of curated relations between biomedical entities are currently available, such as the Pharmacogenomics Knowledge Base (PharmGKB) or the Comparative Toxicogenomics Database (CTD). Biomedical text mining systems, and in particular those which deal with the extraction of relationships among entities, could make better use of the wealth of already curated material. RESULTS: We propose a simple and effective method based on logistic regression (also known as maximum entropy modeling) for an optimized ranking of relation candidates utilizing curated abstracts. Furthermore, we examine the effects and difficulties of using widely available metadata (i.e. MeSH terms and chemical substance index terms) for relation extraction. Cross-validation experiments result in an improvement of the ranking quality in terms of AUCiP/R by 39% (PharmGKB) and 116% (CTD) against a frequency-based baseline of 0.39 (PharmGKB) and 0.21 (CTD). For the TAP-10 metrics, we achieve an improvement of 53% (PharmGKB) and 134% (CTD) against the same baseline system (0.21 PharmGKB and 0.15 CTD). CONCLUSIONS: Our experiments with the PharmGKB and the CTD database show a strong positive effect for the ranking of relation candidates utilizing the vast amount of curated relations covered by currently available knowledge databases. The tasks of concept identification and candidate relation generation profit from the adaptation to previously curated material. This presents an effective and practical method suitable for conservative extension and re-validation of biomedical relations from texts that has been successfully used for curation experiments with the PharmGKB and CTD database.
format	Online Article Text
id	pubmed-3465213
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3465213 2012-10-18 Ranking relations between diseases, drugs and genes for a curation task Clematide, Simon Rinaldi, Fabio J Biomed Semantics Research BioMed Central 2012-10-05 /pmc/articles/PMC3465213/ /pubmed/23046495 http://dx.doi.org/10.1186/2041-1480-3-S3-S5 Text en Copyright ©2012 Clematide and Rinaldi; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Ranking relations between diseases, drugs and genes for a curation task
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3465213/ https://www.ncbi.nlm.nih.gov/pubmed/23046495 http://dx.doi.org/10.1186/2041-1480-3-S3-S5
