Overview of BioCreAtIvE task 1B: normalized gene lists

BACKGROUND: Our goal in BioCreAtIvE has been to assess the state of the art in text mining, with emphasis on applications that reflect real biological applications, e.g., the curation process for model organism databases. This paper summarizes the BioCreAtIvE task 1B, the "Normalized Gene List"...

Bibliographic Details
Main Authors:	Hirschman, Lynette; Colosimo, Marc; Morgan, Alexander; Yeh, Alexander
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Report
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1869004/ https://www.ncbi.nlm.nih.gov/pubmed/15960823 http://dx.doi.org/10.1186/1471-2105-6-S1-S11

author	Hirschman, Lynette; Colosimo, Marc; Morgan, Alexander; Yeh, Alexander
collection	PubMed
description	BACKGROUND: Our goal in BioCreAtIvE has been to assess the state of the art in text mining, with emphasis on applications that reflect real biological applications, e.g., the curation process for model organism databases. This paper summarizes the BioCreAtIvE task 1B, the "Normalized Gene List" task, which was inspired by the gene list supplied for each curated paper in a model organism database. The task was to produce the correct list of unique gene identifiers for the genes and gene products mentioned in sets of abstracts from three model organisms (Yeast, Fly, and Mouse). RESULTS: Eight groups fielded systems for three data sets (Yeast, Fly, and Mouse). For Yeast, the top scoring system (out of 15) achieved 0.92 F-measure (harmonic mean of precision and recall); for Mouse and Fly, the task was more difficult, due to larger numbers of genes, more ambiguity in the gene naming conventions (particularly for Fly), and complex gene names (for Mouse). For Fly, the top F-measure was 0.82 out of 11 systems and for Mouse, it was 0.79 out of 16 systems. CONCLUSION: This assessment demonstrates that multiple groups were able to perform a real biological task across a range of organisms. The performance was dependent on the organism, and specifically on the naming conventions associated with each organism. These results hold out promise that the technology can provide partial automation of the curation process in the near future.
format	Text
id	pubmed-1869004
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1869004 2007-05-18 Overview of BioCreAtIvE task 1B: normalized gene lists Hirschman, Lynette; Colosimo, Marc; Morgan, Alexander; Yeh, Alexander BMC Bioinformatics Report BioMed Central 2005-05-24 /pmc/articles/PMC1869004/ /pubmed/15960823 http://dx.doi.org/10.1186/1471-2105-6-S1-S11 Text en Copyright © 2005 Hirschman et al; licensee BioMed Central Ltd http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Overview of BioCreAtIvE task 1B: normalized gene lists
topic	Report
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1869004/ https://www.ncbi.nlm.nih.gov/pubmed/15960823 http://dx.doi.org/10.1186/1471-2105-6-S1-S11

Similar Items