Progress and challenges in the computational prediction of gene function using networks

In this opinion piece, we attempt to unify recent arguments we have made that serious confounds affect the use of network data to predict and characterize gene function. The development of computational approaches to determine gene function is a major strand of computational genomics research. Howev...

Bibliographic Details
Main Authors: Pavlidis, Paul; Gillis, Jesse
Format: Online Article Text
Language: English
Published: F1000Research 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3782350/
https://www.ncbi.nlm.nih.gov/pubmed/23936626
http://dx.doi.org/10.12688/f1000research.1-14.v1
_version_ 1782285532688023552
author Pavlidis, Paul
Gillis, Jesse
author_facet Pavlidis, Paul
Gillis, Jesse
author_sort Pavlidis, Paul
collection PubMed
description In this opinion piece, we attempt to unify recent arguments we have made that serious confounds affect the use of network data to predict and characterize gene function. The development of computational approaches to determine gene function is a major strand of computational genomics research. However, progress beyond using BLAST to transfer annotations has been surprisingly slow. We have previously argued that a large part of the reported success in using "guilt by association" in network data is due to the tendency of methods to simply assign new functions to already well-annotated genes. While such predictions will tend to be correct, they are generic; it is true, but not very helpful, that a gene with many functions is more likely to have any function. We have also presented evidence that much of the remaining performance in cross-validation cannot be usefully generalized to new predictions, making progressive improvement in analysis difficult to engineer. Here we summarize our findings about how these problems will affect network analysis, discuss some ongoing responses within the field to these issues, and consolidate some recommendations and speculation, which we hope will modestly increase the reliability and specificity of gene function prediction.
format Online
Article
Text
id pubmed-3782350
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-3782350 2013-12-05 Progress and challenges in the computational prediction of gene function using networks Pavlidis, Paul Gillis, Jesse F1000Res Opinion Article F1000Research 2012-09-07 /pmc/articles/PMC3782350/ /pubmed/23936626 http://dx.doi.org/10.12688/f1000research.1-14.v1 Text en Copyright: © 2012 Pavlidis P et al. http://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
http://creativecommons.org/publicdomain/zero/1.0/ Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
spellingShingle Opinion Article
Pavlidis, Paul
Gillis, Jesse
Progress and challenges in the computational prediction of gene function using networks
title Progress and challenges in the computational prediction of gene function using networks
title_sort progress and challenges in the computational prediction of gene function using networks
topic Opinion Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3782350/
https://www.ncbi.nlm.nih.gov/pubmed/23936626
http://dx.doi.org/10.12688/f1000research.1-14.v1
work_keys_str_mv AT pavlidispaul progressandchallengesinthecomputationalpredictionofgenefunctionusingnetworks
AT gillisjesse progressandchallengesinthecomputationalpredictionofgenefunctionusingnetworks