Developing a biocuration workflow for AgBase, a non-model organism database
AgBase provides annotation for agricultural gene products using the Gene Ontology (GO) and Plant Ontology, as appropriate. Unlike model organism species, agricultural species have a body of literature that does not just focus on gene function; to improve efficiency, we use text mining to identify li...
Main authors: | Pillai, Lakshmi; Chouvarine, Philippe; Tudor, Catalina O.; Schmidt, Carl J.; Vijay-Shanker, K.; McCarthy, Fiona M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2012 |
Subjects: | BioCreative Virtual Issue |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3500517/ https://www.ncbi.nlm.nih.gov/pubmed/23160411 http://dx.doi.org/10.1093/database/bas038 |
_version_ | 1782250116045864960 |
---|---|
author | Pillai, Lakshmi Chouvarine, Philippe Tudor, Catalina O. Schmidt, Carl J. Vijay-Shanker, K. McCarthy, Fiona M. |
author_facet | Pillai, Lakshmi Chouvarine, Philippe Tudor, Catalina O. Schmidt, Carl J. Vijay-Shanker, K. McCarthy, Fiona M. |
author_sort | Pillai, Lakshmi |
collection | PubMed |
description | AgBase provides annotation for agricultural gene products using the Gene Ontology (GO) and Plant Ontology, as appropriate. Unlike model organism species, agricultural species have a body of literature that does not just focus on gene function; to improve efficiency, we use text mining to identify literature for curation. The first component of our annotation interface is the gene prioritization interface that ranks gene products for annotation. Biocurators select the top-ranked gene and mark annotation for these genes as ‘in progress’ or ‘completed’; links enable biocurators to move directly to our biocuration interface (BI). Our BI includes all current GO annotation for gene products and is the main interface to add/modify AgBase curation data. The BI also displays Extracting Genic Information from Text (eGIFT) results for each gene product. eGIFT is a web-based, text-mining tool that associates ranked, informative terms (iTerms) and the articles and sentences containing them, with genes. Moreover, iTerms are linked to GO terms, where they match either a GO term name or a synonym. This enables AgBase biocurators to rapidly identify literature for further curation based on possible GO terms. Because most agricultural species do not have standardized literature, eGIFT searches all gene names and synonyms to associate articles with genes. As many of the gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to this gene, and filtering is applied to remove abstracts that mention a gene in passing. The BI is linked to our Journal Database (JDB) where corresponding journal citations are stored. Just as importantly, biocurators also add to the JDB citations that have no GO annotation. The AgBase BI also supports bulk annotation upload to facilitate our Inferred from electronic annotation of agricultural gene products. All annotations must pass standard GO Consortium quality checking before release in AgBase. Database URL: http://www.agbase.msstate.edu/ |
format | Online Article Text |
id | pubmed-3500517 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-35005172012-11-19 Developing a biocuration workflow for AgBase, a non-model organism database Pillai, Lakshmi Chouvarine, Philippe Tudor, Catalina O. Schmidt, Carl J. Vijay-Shanker, K. McCarthy, Fiona M. Database (Oxford) BioCreative Virtual Issue AgBase provides annotation for agricultural gene products using the Gene Ontology (GO) and Plant Ontology, as appropriate. Unlike model organism species, agricultural species have a body of literature that does not just focus on gene function; to improve efficiency, we use text mining to identify literature for curation. The first component of our annotation interface is the gene prioritization interface that ranks gene products for annotation. Biocurators select the top-ranked gene and mark annotation for these genes as ‘in progress’ or ‘completed’; links enable biocurators to move directly to our biocuration interface (BI). Our BI includes all current GO annotation for gene products and is the main interface to add/modify AgBase curation data. The BI also displays Extracting Genic Information from Text (eGIFT) results for each gene product. eGIFT is a web-based, text-mining tool that associates ranked, informative terms (iTerms) and the articles and sentences containing them, with genes. Moreover, iTerms are linked to GO terms, where they match either a GO term name or a synonym. This enables AgBase biocurators to rapidly identify literature for further curation based on possible GO terms. Because most agricultural species do not have standardized literature, eGIFT searches all gene names and synonyms to associate articles with genes. As many of the gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to this gene, and filtering is applied to remove abstracts that mention a gene in passing. The BI is linked to our Journal Database (JDB) where corresponding journal citations are stored. Just as importantly, biocurators also add to the JDB citations that have no GO annotation. The AgBase BI also supports bulk annotation upload to facilitate our Inferred from electronic annotation of agricultural gene products. All annotations must pass standard GO Consortium quality checking before release in AgBase. Database URL: http://www.agbase.msstate.edu/ Oxford University Press 2012-11-15 /pmc/articles/PMC3500517/ /pubmed/23160411 http://dx.doi.org/10.1093/database/bas038 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com. |
spellingShingle | BioCreative Virtual Issue Pillai, Lakshmi Chouvarine, Philippe Tudor, Catalina O. Schmidt, Carl J. Vijay-Shanker, K. McCarthy, Fiona M. Developing a biocuration workflow for AgBase, a non-model organism database |
title | Developing a biocuration workflow for AgBase, a non-model organism database |
title_full | Developing a biocuration workflow for AgBase, a non-model organism database |
title_fullStr | Developing a biocuration workflow for AgBase, a non-model organism database |
title_full_unstemmed | Developing a biocuration workflow for AgBase, a non-model organism database |
title_short | Developing a biocuration workflow for AgBase, a non-model organism database |
title_sort | developing a biocuration workflow for agbase, a non-model organism database |
topic | BioCreative Virtual Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3500517/ https://www.ncbi.nlm.nih.gov/pubmed/23160411 http://dx.doi.org/10.1093/database/bas038 |
work_keys_str_mv | AT pillailakshmi developingabiocurationworkflowforagbaseanonmodelorganismdatabase AT chouvarinephilippe developingabiocurationworkflowforagbaseanonmodelorganismdatabase AT tudorcatalinao developingabiocurationworkflowforagbaseanonmodelorganismdatabase AT schmidtcarlj developingabiocurationworkflowforagbaseanonmodelorganismdatabase AT vijayshankerk developingabiocurationworkflowforagbaseanonmodelorganismdatabase AT mccarthyfionam developingabiocurationworkflowforagbaseanonmodelorganismdatabase |
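
The description above states that iTerms are linked to GO terms where they match either a GO term name or a synonym. The following is a minimal sketch of that kind of lookup, not the eGIFT or AgBase implementation. The `GOTerm` class, the `normalize`, `build_index`, and `link_iterms` helpers, and the case-insensitive, whitespace-collapsing exact match are all assumptions made for illustration, and the two GO entries are toy data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical minimal GO term record: ID, name, and synonyms."""
    go_id: str
    name: str
    synonyms: Tuple[str, ...] = ()


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed matching rule)."""
    return " ".join(text.lower().split())


def build_index(terms: Iterable[GOTerm]) -> Dict[str, Set[str]]:
    """Map every normalized name or synonym to the GO IDs that carry it."""
    index: Dict[str, Set[str]] = {}
    for term in terms:
        for label in (term.name,) + term.synonyms:
            index.setdefault(normalize(label), set()).add(term.go_id)
    return index


def link_iterms(iterms: Iterable[str],
                index: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, for each iTerm, the sorted GO IDs whose name or synonym matches."""
    return {t: sorted(index.get(normalize(t), set())) for t in iterms}


# Toy data for illustration only.
TOY_TERMS = [
    GOTerm("GO:0006915", "apoptotic process", ("apoptosis",)),
    GOTerm("GO:0006954", "inflammatory response"),
]

if __name__ == "__main__":
    links = link_iterms(["Apoptosis", "inflammatory response", "chicken"],
                        build_index(TOY_TERMS))
    for iterm, go_ids in links.items():
        print(iterm, go_ids)
```

An iTerm with no matching name or synonym maps to an empty list, so a curator-facing view could show it without a GO link rather than dropping it.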