BioCreAtIvE Task 1A: gene mention finding evaluation

BACKGROUND: The biological research literature is a major repository of knowledge. As the amount of literature increases, it will get harder to find the information of interest on a particular topic. There has been an increasing amount of work on text mining this literature, but comparing this work...

Bibliographic Details
Main Authors: Yeh, Alexander, Morgan, Alexander, Colosimo, Marc, Hirschman, Lynette
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1869012/
https://www.ncbi.nlm.nih.gov/pubmed/15960832
http://dx.doi.org/10.1186/1471-2105-6-S1-S2
author Yeh, Alexander
Morgan, Alexander
Colosimo, Marc
Hirschman, Lynette
collection PubMed
description BACKGROUND: The biological research literature is a major repository of knowledge. As the amount of literature increases, it will get harder to find the information of interest on a particular topic. There has been an increasing amount of work on text mining this literature, but comparing this work is hard because of a lack of standards for making comparisons. To address this, we worked with colleagues at the Protein Design Group, CNB-CSIC, Madrid to develop BioCreAtIvE (Critical Assessment for Information Extraction in Biology), an open common evaluation of systems on a number of biological text mining tasks. We report here on task 1A, which deals with finding mentions of genes and related entities in text. "Finding mentions" is a basic task, which can be used as a building block for other text mining tasks. The task makes use of data and evaluation software provided by the (US) National Center for Biotechnology Information (NCBI). RESULTS: 15 teams took part in task 1A. A number of teams achieved scores over 80% F-measure (balanced precision and recall). The teams that tried to use their task 1A systems to help on other BioCreAtIvE tasks reported mixed results. CONCLUSION: The 80% plus F-measure results are good, but still somewhat lag the best scores achieved in some other domains such as newswire, due in part to the complexity and length of gene names, compared to person or organization names in newswire.
format Text
id pubmed-1869012
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1869012 2007-05-18 BioCreAtIvE Task 1A: gene mention finding evaluation Yeh, Alexander Morgan, Alexander Colosimo, Marc Hirschman, Lynette BMC Bioinformatics Report
BioMed Central 2005-05-24 /pmc/articles/PMC1869012/ /pubmed/15960832 http://dx.doi.org/10.1186/1471-2105-6-S1-S2 Text en Copyright © 2005 Yeh et al; licensee BioMed Central Ltd http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title BioCreAtIvE Task 1A: gene mention finding evaluation
topic Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1869012/
https://www.ncbi.nlm.nih.gov/pubmed/15960832
http://dx.doi.org/10.1186/1471-2105-6-S1-S2