
Overview of BioCreAtIvE: critical assessment of information extraction for biology

BACKGROUND: The goal of the first BioCreAtIvE challenge (Critical Assessment of Information Extraction in Biology) was to provide a set of common evaluation tasks to assess the state of the art for text mining applied to biological problems. The results were presented in a workshop held in Granada,...


Bibliographic Details
Main Authors: Hirschman, Lynette; Yeh, Alexander; Blaschke, Christian; Valencia, Alfonso
Format: Text
Language: English
Published: BioMed Central 2005
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1869002/
https://www.ncbi.nlm.nih.gov/pubmed/15960821
http://dx.doi.org/10.1186/1471-2105-6-S1-S1
author Hirschman, Lynette
Yeh, Alexander
Blaschke, Christian
Valencia, Alfonso
collection PubMed
description BACKGROUND: The goal of the first BioCreAtIvE challenge (Critical Assessment of Information Extraction in Biology) was to provide a set of common evaluation tasks to assess the state of the art for text mining applied to biological problems. The results were presented in a workshop held in Granada, Spain, March 28–31, 2004. The articles collected in this BMC Bioinformatics supplement entitled "A critical assessment of text mining methods in molecular biology" describe the BioCreAtIvE tasks, systems, results and their independent evaluation. RESULTS: BioCreAtIvE focused on two tasks. The first dealt with extraction of gene or protein names from text, and their mapping into standardized gene identifiers for three model organism databases (fly, mouse, yeast). The second task addressed issues of functional annotation, requiring systems to identify specific text passages that supported Gene Ontology annotations for specific proteins, given full text articles. CONCLUSION: The first BioCreAtIvE assessment achieved a high level of international participation (27 groups from 10 countries). The assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision / recall or better, which potentially makes them suitable for real applications in biology. The results for the advanced task (functional annotation from free text) were significantly lower, demonstrating the current limitations of text-mining approaches where knowledge extrapolation and interpretation are required. In addition, an important contribution of BioCreAtIvE has been the creation and release of training and test data sets for both tasks. There are 22 articles in this special issue, six of which provide analyses of results or data quality for the data sets, among them a novel inter-annotator consistency assessment for the test set used in task 2.
format Text
id pubmed-1869002
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1869002 2007-05-18 Overview of BioCreAtIvE: critical assessment of information extraction for biology Hirschman, Lynette Yeh, Alexander Blaschke, Christian Valencia, Alfonso BMC Bioinformatics Introduction BioMed Central 2005-05-24 /pmc/articles/PMC1869002/ /pubmed/15960821 http://dx.doi.org/10.1186/1471-2105-6-S1-S1 Text en Copyright © 2005 Hirschman et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Overview of BioCreAtIvE: critical assessment of information extraction for biology
topic Introduction