Large-scale directional relationship extraction and resolution

BACKGROUND: Relationships between entities such as genes, chemicals, metabolites, phenotypes and diseases in MEDLINE are often directional. That is, one may affect the other in a positive or negative manner. Detection of causality and direction is key in piecing pathways together and in examining po...

Bibliographic Details
Main authors: Giles, Cory B, Wren, Jonathan D
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2537562/
https://www.ncbi.nlm.nih.gov/pubmed/18793456
http://dx.doi.org/10.1186/1471-2105-9-S9-S11
author Giles, Cory B
Wren, Jonathan D
collection PubMed
description BACKGROUND: Relationships between entities such as genes, chemicals, metabolites, phenotypes and diseases in MEDLINE are often directional. That is, one may affect the other in a positive or negative manner. Detection of causality and direction is key in piecing pathways together and in examining possible implications of experimental results. Because of the size and growth of biomedical literature, it is increasingly important to be able to automate this process as much as possible. RESULTS: Here we present a method of relation extraction using dependency graph parsing with SVM classification. We tested the SVM classifier first on gold standard corpora from GENIA and find it achieved 82% precision and 94.8% recall (F-measure of 87.9) on these standardized test sets. We then applied the entire system to all available MEDLINE abstracts for two target interactions with known effects. We find that while some directional relations are extracted with low ambiguity, others are apparently contradictory, at least when considered in an isolated context. When examined, it is apparent some are dependent upon the surrounding context (e.g. whether the relationship referred to short-term or long-term effects, or whether the focus was extracellular versus intracellular). CONCLUSION: Thesaurus-based directional relation extraction can be done with reasonable accuracy, but is prone to false-positives on larger corpora due to noun modifiers. Furthermore, methods of resolving or disambiguating relationship context and contingencies are important for large-scale corpora.
format Text
id pubmed-2537562
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2537562 2008-09-17 Large-scale directional relationship extraction and resolution Giles, Cory B Wren, Jonathan D BMC Bioinformatics Proceedings BioMed Central 2008-08-12 /pmc/articles/PMC2537562/ /pubmed/18793456 http://dx.doi.org/10.1186/1471-2105-9-S9-S11 Text en Copyright © 2008 Giles and Wren; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Large-scale directional relationship extraction and resolution
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2537562/
https://www.ncbi.nlm.nih.gov/pubmed/18793456
http://dx.doi.org/10.1186/1471-2105-9-S9-S11