Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach

BACKGROUND: Incorrectly annotated sequence data are becoming more commonplace as databases increasingly rely on automated techniques for annotation. Hence, there is an urgent need for computational methods for checking consistency of such annotations against independent sources of evidence and detec...

Bibliographic Details
Main authors:	Andorf, Carson; Dobbs, Drena; Honavar, Vasant
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Correspondence
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1994202/ https://www.ncbi.nlm.nih.gov/pubmed/17683567 http://dx.doi.org/10.1186/1471-2105-8-284

author	Andorf, Carson; Dobbs, Drena; Honavar, Vasant
collection	PubMed
description	BACKGROUND: Incorrectly annotated sequence data are becoming more commonplace as databases increasingly rely on automated techniques for annotation. Hence, there is an urgent need for computational methods for checking consistency of such annotations against independent sources of evidence and detecting potential annotation errors. We show how a machine learning approach designed to automatically predict a protein's Gene Ontology (GO) functional class can be employed to identify potential gene annotation errors. RESULTS: In a set of 211 previously annotated mouse protein kinases, we found that 201 of the GO annotations returned by AmiGO appear to be inconsistent with the UniProt functions assigned to their human counterparts. In contrast, 97% of the predicted annotations generated using a machine learning approach were consistent with the UniProt annotations of the human counterparts, as well as with available annotations for these mouse protein kinases in the Mouse Kinome database. CONCLUSION: We conjecture that most of our predicted annotations are, therefore, correct and suggest that the machine learning approach developed here could be routinely used to detect potential errors in GO annotations generated by high-throughput gene annotation projects. Editor's Note: Authors from the original publication (Okazaki et al.: Nature 2002, 420:563–73) have provided their response to Andorf et al., directly following the correspondence.
format	Text
id	pubmed-1994202
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
journal	BMC Bioinformatics
publication date	2007-08-03
rights	Copyright © 2007 Andorf et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach
topic	Correspondence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1994202/ https://www.ncbi.nlm.nih.gov/pubmed/17683567 http://dx.doi.org/10.1186/1471-2105-8-284