An environment for relation mining over richly annotated corpora: the case of GENIA


Bibliographic Details
Main Authors: Rinaldi, Fabio, Schneider, Gerold, Kaljurand, Kaarel, Hess, Michael, Romacker, Martin
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1764447/
https://www.ncbi.nlm.nih.gov/pubmed/17134476
http://dx.doi.org/10.1186/1471-2105-7-S3-S3
author Rinaldi, Fabio
Schneider, Gerold
Kaljurand, Kaarel
Hess, Michael
Romacker, Martin
collection PubMed
description BACKGROUND: The biomedical domain is witnessing a rapid growth of the amount of published scientific results, which makes it increasingly difficult to filter the core information. There is a real need for support tools that 'digest' the published results and extract the most important information. RESULTS: We describe and evaluate an environment supporting the extraction of domain-specific relations, such as protein-protein interactions, from a richly-annotated corpus. We use full, deep-linguistic parsing and manually created, versatile patterns, expressing a large set of syntactic alternations, plus semantic ontology information. CONCLUSION: The experiments show that our approach described is capable of delivering high-precision results, while maintaining sufficient levels of recall. The high level of abstraction of the rules used by the system, which are considerably more powerful and versatile than finite-state approaches, allows speedy interactive development and validation.
format Text
id pubmed-1764447
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1764447 2007-01-09 An environment for relation mining over richly annotated corpora: the case of GENIA Rinaldi, Fabio Schneider, Gerold Kaljurand, Kaarel Hess, Michael Romacker, Martin BMC Bioinformatics Proceedings BACKGROUND: The biomedical domain is witnessing a rapid growth of the amount of published scientific results, which makes it increasingly difficult to filter the core information. There is a real need for support tools that 'digest' the published results and extract the most important information. RESULTS: We describe and evaluate an environment supporting the extraction of domain-specific relations, such as protein-protein interactions, from a richly-annotated corpus. We use full, deep-linguistic parsing and manually created, versatile patterns, expressing a large set of syntactic alternations, plus semantic ontology information. CONCLUSION: The experiments show that our approach described is capable of delivering high-precision results, while maintaining sufficient levels of recall. The high level of abstraction of the rules used by the system, which are considerably more powerful and versatile than finite-state approaches, allows speedy interactive development and validation. BioMed Central 2006-11-24 /pmc/articles/PMC1764447/ /pubmed/17134476 http://dx.doi.org/10.1186/1471-2105-7-S3-S3 Text en Copyright © 2006 Rinaldi et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An environment for relation mining over richly annotated corpora: the case of GENIA
topic Proceedings