GOAnnotator: linking protein GO annotations to evidence text

BACKGROUND: Annotation of proteins with gene ontology (GO) terms is ongoing work and a complex task. Manual GO annotation is precise and precious, but it is time-consuming. Therefore, instead of curated annotations most of the proteins come with uncurated annotations, which have been generated autom...

Bibliographic Details
Main authors: Couto, Francisco M, Silva, Mário J, Lee, Vivian, Dimmer, Emily, Camon, Evelyn, Apweiler, Rolf, Kirsch, Harald, Rebholz-Schuhmann, Dietrich
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1769513/
https://www.ncbi.nlm.nih.gov/pubmed/17181854
http://dx.doi.org/10.1186/1747-5333-1-19
author Couto, Francisco M
Silva, Mário J
Lee, Vivian
Dimmer, Emily
Camon, Evelyn
Apweiler, Rolf
Kirsch, Harald
Rebholz-Schuhmann, Dietrich
collection PubMed
description BACKGROUND: Annotation of proteins with gene ontology (GO) terms is ongoing work and a complex task. Manual GO annotation is precise and precious, but it is time-consuming. Therefore, instead of curated annotations most of the proteins come with uncurated annotations, which have been generated automatically. Text-mining systems that use literature for automatic annotation have been proposed but they do not satisfy the high quality expectations of curators. RESULTS: In this paper we describe an approach that links uncurated annotations to text extracted from literature. The selection of the text is based on the similarity of the text to the term from the uncurated annotation. Besides substantiating the uncurated annotations, the extracted texts also lead to novel annotations. In addition, the approach uses the GO hierarchy to achieve high precision. Our approach is integrated into GOAnnotator, a tool that assists the curation process for GO annotation of UniProt proteins. CONCLUSION: The GO curators assessed GOAnnotator with a set of 66 distinct UniProt/SwissProt proteins with uncurated annotations. GOAnnotator provided correct evidence text at 93% precision. This high precision results from using the GO hierarchy to only select GO terms similar to GO terms from uncurated annotations in GOA. Our approach is the first one to achieve high precision, which is crucial for the efficient support of GO curators. GOAnnotator was implemented as a web tool that is freely available at .
format Text
id pubmed-1769513
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1769513 2007-01-16 GOAnnotator: linking protein GO annotations to evidence text Couto, Francisco M Silva, Mário J Lee, Vivian Dimmer, Emily Camon, Evelyn Apweiler, Rolf Kirsch, Harald Rebholz-Schuhmann, Dietrich J Biomed Discov Collab Software BioMed Central 2006-12-20 /pmc/articles/PMC1769513/ /pubmed/17181854 http://dx.doi.org/10.1186/1747-5333-1-19 Text en Copyright © 2006 Couto et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title GOAnnotator: linking protein GO annotations to evidence text
topic Software