Progress and challenges in the computational prediction of gene function using networks
In this opinion piece, we attempt to unify recent arguments we have made that serious confounds affect the use of network data to predict and characterize gene function. The development of computational approaches to determine gene function is a major strand of computational genomics research. Howev...
Main authors: | Pavlidis, Paul; Gillis, Jesse |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000Research 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3782350/ https://www.ncbi.nlm.nih.gov/pubmed/23936626 http://dx.doi.org/10.12688/f1000research.1-14.v1 |
_version_ | 1782285532688023552 |
---|---|
author | Pavlidis, Paul Gillis, Jesse |
collection | PubMed |
description | In this opinion piece, we attempt to unify recent arguments we have made that serious confounds affect the use of network data to predict and characterize gene function. The development of computational approaches to determine gene function is a major strand of computational genomics research. However, progress beyond using BLAST to transfer annotations has been surprisingly slow. We have previously argued that a large part of the reported success in using "guilt by association" in network data is due to the tendency of methods to simply assign new functions to already well-annotated genes. While such predictions will tend to be correct, they are generic; it is true, but not very helpful, that a gene with many functions is more likely to have any function. We have also presented evidence that much of the remaining performance in cross-validation cannot be usefully generalized to new predictions, making progressive improvement in analysis difficult to engineer. Here we summarize our findings about how these problems will affect network analysis, discuss some ongoing responses within the field to these issues, and consolidate some recommendations and speculation, which we hope will modestly increase the reliability and specificity of gene function prediction. |
format | Online Article Text |
id | pubmed-3782350 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | F1000Research |
record_format | MEDLINE/PubMed |
spelling | pubmed-3782350 2013-12-05 Progress and challenges in the computational prediction of gene function using networks Pavlidis, Paul Gillis, Jesse F1000Res Opinion Article F1000Research 2012-09-07 /pmc/articles/PMC3782350/ /pubmed/23936626 http://dx.doi.org/10.12688/f1000research.1-14.v1 Text en Copyright: © 2012 Pavlidis P et al. http://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. http://creativecommons.org/publicdomain/zero/1.0/ Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication). |
title | Progress and challenges in the computational prediction of gene function using networks |
topic | Opinion Article |