
Opportunities for text mining in the FlyBase genetic literature curation workflow

FlyBase is the model organism database for Drosophila genetic and genomic information. Over the last 20 years, FlyBase has had to adapt and change to keep abreast of advances in biology and database design. We are continually looking for ways to improve curation efficiency and efficacy. Genetic lite...

Bibliographic Details
Main author: McQuilton, Peter
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects: BioCreative Virtual Issue
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3500518/
https://www.ncbi.nlm.nih.gov/pubmed/23160412
http://dx.doi.org/10.1093/database/bas039
author McQuilton, Peter
collection PubMed
description FlyBase is the model organism database for Drosophila genetic and genomic information. Over the last 20 years, FlyBase has had to adapt and change to keep abreast of advances in biology and database design. We are continually looking for ways to improve curation efficiency and efficacy. Genetic literature curation focuses on the extraction of genetic entities (e.g. genes, mutant alleles, transgenic constructs) and their associated phenotypes and Gene Ontology terms from the published literature. Over 2000 Drosophila research articles are now published every year. These articles are becoming ever more data-rich and there is a growing need for text mining to shoulder some of the burden of paper triage and data extraction. In this article, we describe our curation workflow, along with some of the problems and bottlenecks therein, and highlight the opportunities for text mining. We do so in the hope of encouraging the BioCreative community to help us to develop effective methods to mine this torrent of information. Database URL: http://flybase.org
format Online
Article
Text
id pubmed-3500518
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3500518 2012-11-19 Opportunities for text mining in the FlyBase genetic literature curation workflow McQuilton, Peter Database (Oxford) BioCreative Virtual Issue Oxford University Press 2012-11-15 /pmc/articles/PMC3500518/ /pubmed/23160412 http://dx.doi.org/10.1093/database/bas039 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
title Opportunities for text mining in the FlyBase genetic literature curation workflow
topic BioCreative Virtual Issue