tagtog: interactive and text-mining-assisted annotation of gene mentions in PLOS full-text articles

The breadth and depth of biomedical literature are increasing year upon year. To keep abreast of these increases, FlyBase, a database for Drosophila genomic and genetic information, is constantly exploring new ways to mine the published literature to increase the efficiency and accuracy of manual cu...

Bibliographic Details
Main authors: Cejuela, Juan Miguel, McQuilton, Peter, Ponting, Laura, Marygold, Steven J., Stefancsik, Raymund, Millburn, Gillian H., Rost, Burkhard
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3978375/
https://www.ncbi.nlm.nih.gov/pubmed/24715220
http://dx.doi.org/10.1093/database/bau033
author Cejuela, Juan Miguel
McQuilton, Peter
Ponting, Laura
Marygold, Steven J.
Stefancsik, Raymund
Millburn, Gillian H.
Rost, Burkhard
author_sort Cejuela, Juan Miguel
collection PubMed
description The breadth and depth of biomedical literature are increasing year upon year. To keep abreast of these increases, FlyBase, a database for Drosophila genomic and genetic information, is constantly exploring new ways to mine the published literature to increase the efficiency and accuracy of manual curation and to automate some aspects, such as triaging and entity extraction. Toward this end, we present the ‘tagtog’ system, a web-based annotation framework that can be used to mark up biological entities (such as genes) and concepts (such as Gene Ontology terms) in full-text articles. tagtog leverages manual user annotation in combination with automatic machine-learned annotation to provide accurate identification of gene symbols and gene names. As part of the BioCreative IV Interactive Annotation Task, FlyBase has used tagtog to identify and extract mentions of Drosophila melanogaster gene symbols and names in full-text biomedical articles from the PLOS stable of journals. We show here the results of three experiments with different sized corpora and assess gene recognition performance and curation speed. We conclude that tagtog-named entity recognition improves with a larger corpus and that tagtog-assisted curation is quicker than manual curation. Database URL: www.tagtog.net, www.flybase.org
format Online
Article
Text
id pubmed-3978375
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-39783752014-04-09 tagtog: interactive and text-mining-assisted annotation of gene mentions in PLOS full-text articles Cejuela, Juan Miguel McQuilton, Peter Ponting, Laura Marygold, Steven J. Stefancsik, Raymund Millburn, Gillian H. Rost, Burkhard Database (Oxford) Original Article The breadth and depth of biomedical literature are increasing year upon year. To keep abreast of these increases, FlyBase, a database for Drosophila genomic and genetic information, is constantly exploring new ways to mine the published literature to increase the efficiency and accuracy of manual curation and to automate some aspects, such as triaging and entity extraction. Toward this end, we present the ‘tagtog’ system, a web-based annotation framework that can be used to mark up biological entities (such as genes) and concepts (such as Gene Ontology terms) in full-text articles. tagtog leverages manual user annotation in combination with automatic machine-learned annotation to provide accurate identification of gene symbols and gene names. As part of the BioCreative IV Interactive Annotation Task, FlyBase has used tagtog to identify and extract mentions of Drosophila melanogaster gene symbols and names in full-text biomedical articles from the PLOS stable of journals. We show here the results of three experiments with different sized corpora and assess gene recognition performance and curation speed. We conclude that tagtog-named entity recognition improves with a larger corpus and that tagtog-assisted curation is quicker than manual curation. Database URL: www.tagtog.net, www.flybase.org Oxford University Press 2014-04-07 /pmc/articles/PMC3978375/ /pubmed/24715220 http://dx.doi.org/10.1093/database/bau033 Text en © The Author(s) 2014. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title tagtog: interactive and text-mining-assisted annotation of gene mentions in PLOS full-text articles
topic Original Article