NoGOA: predicting noisy GO annotations using evidences and sparse representation

BACKGROUND: Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resource limitations, only a small portion of annotations are manually checked by curators, and the...

Bibliographic Details
Main Authors:	Yu, Guoxian; Lu, Chang; Wang, Jun
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5521088/ https://www.ncbi.nlm.nih.gov/pubmed/28732468 http://dx.doi.org/10.1186/s12859-017-1764-z

_version_	1783251915230937088
author	Yu, Guoxian; Lu, Chang; Wang, Jun
author_sort	Yu, Guoxian
collection	PubMed
description	BACKGROUND: Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resource limitations, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that a considerable number of noisy (or incorrect) annotations remain. Given the wide application of annotations, identifying noisy annotations is an important yet seldom studied open problem. RESULTS: We introduce a novel approach called NoGOA to predict noisy annotations. First, NoGOA applies sparse representation to the gene-term association matrix to reduce the impact of noisy annotations, and uses the sparse representation coefficients to measure the semantic similarity between genes. Second, it makes a preliminary prediction of the noisy annotations of a gene based on aggregated votes from that gene's semantic neighbors. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived at different periods, then weights entries of the association matrix by the estimated ratios and propagates the weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix and the aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and that removing noisy annotations improves the performance of gene function prediction. CONCLUSIONS: The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. Code and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1764-z) contains supplementary material, which is available to authorized users.
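The evidence-weighting step in the abstract above (weight each direct annotation by its evidence code, then propagate the weight to ancestor terms through the GO hierarchy) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The noise ratios, the term identifiers, the rule `weight = 1 - noise ratio`, and the choice to keep the maximum weight at each ancestor are all assumptions made for illustration; the abstract does not specify them. The function names `ancestors` and `evidence_weighted` are hypothetical.

```python
from collections import defaultdict

# Hypothetical noise ratios per evidence code (illustrative values, not from the paper).
NOISE_RATIO = {"IEA": 0.4, "ISS": 0.2, "IDA": 0.1}

# Toy GO fragment: term -> direct parents (hypothetical identifiers).
PARENTS = {"t3": ["t2"], "t2": ["t1"], "t1": []}


def ancestors(term, parents):
    """Return all ancestors of `term` by walking the parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, []))
    return seen


def evidence_weighted(direct, parents, noise_ratio):
    """Build gene -> {term: weight} from (gene, term, evidence_code) triples.

    Assumption: a direct annotation gets weight 1 - noise_ratio[code], and
    each ancestor keeps the largest weight propagated to it.
    """
    weights = defaultdict(dict)
    for gene, term, code in direct:
        w = 1.0 - noise_ratio[code]
        for t in {term} | ancestors(term, parents):
            weights[gene][t] = max(weights[gene].get(t, 0.0), w)
    return {gene: dict(terms) for gene, terms in weights.items()}


# Usage: one gene with an electronic (IEA) and an experimental (IDA) annotation.
direct = [("g1", "t3", "IEA"), ("g1", "t2", "IDA")]
weighted = evidence_weighted(direct, PARENTS, NOISE_RATIO)
```

Under these assumptions, an annotation supported only by a code with a high estimated noise ratio ends up with a low weight, which is the kind of signal the abstract says is later combined with the aggregated neighbor votes.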
format	Online Article Text
id	pubmed-5521088
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5521088 2017-07-21 NoGOA: predicting noisy GO annotations using evidences and sparse representation Yu, Guoxian Lu, Chang Wang, Jun BMC Bioinformatics Methodology Article BioMed Central 2017-07-21 /pmc/articles/PMC5521088/ /pubmed/28732468 http://dx.doi.org/10.1186/s12859-017-1764-z Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	NoGOA: predicting noisy GO annotations using evidences and sparse representation
topic	Methodology Article