A Top-Performing Algorithm for the DREAM3 Gene Expression Prediction Challenge

A wealth of computational methods has been developed to address problems in systems biology, such as modeling gene expression. However, to objectively evaluate and compare such methods is notoriously difficult. The DREAM (Dialogue on Reverse Engineering Assessments and Methods) project is a communit...

Bibliographic Details
Main Author: Ruan, Jianhua
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2816205/
https://www.ncbi.nlm.nih.gov/pubmed/20140212
http://dx.doi.org/10.1371/journal.pone.0008944
_version_ 1782177076622655488
author Ruan, Jianhua
author_facet Ruan, Jianhua
author_sort Ruan, Jianhua
collection PubMed
description A wealth of computational methods has been developed to address problems in systems biology, such as modeling gene expression. However, objectively evaluating and comparing such methods is notoriously difficult. The DREAM (Dialogue on Reverse Engineering Assessments and Methods) project is a community-wide effort to assess the relative strengths and weaknesses of different computational methods for a set of core problems in systems biology. This article presents a top-performing algorithm for one of the challenge problems in the third annual DREAM (DREAM3), namely the gene expression prediction challenge. In this challenge, participants are asked to predict the expression levels of a small set of genes in a yeast deletion strain, given the expression levels of all other genes in the same strain and complete gene expression data for several other yeast strains. I propose a simple k-nearest-neighbor (KNN) method to solve this problem. Despite its simplicity, this method works well for this challenge, sharing the “top performer” honor with a much more sophisticated method. I also describe several alternative, simple strategies, including a modified KNN algorithm that further improves the performance of the standard KNN method. The success of these methods suggests that complex methods attempting to integrate multiple data sets do not necessarily lead to better performance than simple yet robust methods. Furthermore, none of these top-performing methods, including the one by a different team, is based on gene regulatory networks, which suggests that accurately modeling gene expression using gene regulatory networks is unfortunately still a difficult task.
format Text
id pubmed-2816205
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-28162052010-02-07 A Top-Performing Algorithm for the DREAM3 Gene Expression Prediction Challenge Ruan, Jianhua PLoS One Research Article
Public Library of Science 2010-02-04 /pmc/articles/PMC2816205/ /pubmed/20140212 http://dx.doi.org/10.1371/journal.pone.0008944 Text en Jianhua Ruan. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Ruan, Jianhua
A Top-Performing Algorithm for the DREAM3 Gene Expression Prediction Challenge
title A Top-Performing Algorithm for the DREAM3 Gene Expression Prediction Challenge
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2816205/
https://www.ncbi.nlm.nih.gov/pubmed/20140212
http://dx.doi.org/10.1371/journal.pone.0008944
work_keys_str_mv AT ruanjianhua atopperformingalgorithmforthedream3geneexpressionpredictionchallenge
AT ruanjianhua topperformingalgorithmforthedream3geneexpressionpredictionchallenge
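
The k-nearest-neighbor idea summarized in the description can be illustrated with a small sketch. This record does not give the details of the published method, so everything below is an assumption made for illustration: genes are compared by Euclidean distance between their expression profiles in the other (fully measured) strains, and a missing gene in the target strain is predicted as the unweighted mean of its k nearest measured genes. The function names, the toy gene names, and the toy values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(reference, target_known, target_genes, k=3):
    """Predict target-strain expression for genes with missing measurements.

    reference    -- gene -> profile in the other strains (complete data)
    target_known -- gene -> measured values in the target strain
    target_genes -- genes whose target-strain values are to be predicted
    k            -- number of neighbors (assumed; not taken from the record)
    """
    predictions = {}
    for gene in target_genes:
        # Rank genes measured in the target strain by how similar their
        # reference profiles are to the reference profile of this gene.
        candidates = sorted(
            (g for g in target_known if g != gene),
            key=lambda g: euclidean(reference[gene], reference[g]),
        )
        neighbors = candidates[:k]
        if not neighbors:
            raise ValueError("no measured genes available as neighbors")
        n_points = len(target_known[neighbors[0]])
        # Unweighted mean of the neighbors' target-strain values per point.
        predictions[gene] = [
            sum(target_known[g][t] for g in neighbors) / len(neighbors)
            for t in range(n_points)
        ]
    return predictions


# Toy data (hypothetical): three reference conditions, two target time points.
reference = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [1.1, 2.1, 2.9],
    "g3": [0.9, 1.9, 3.1],
    "g4": [5.0, 5.0, 5.0],
    "gX": [1.0, 2.0, 3.0],
}
target_known = {
    "g1": [2.0, 4.0],
    "g2": [3.0, 5.0],
    "g3": [1.0, 3.0],
    "g4": [9.0, 9.0],
}
pred = knn_predict(reference, target_known, ["gX"], k=3)
print(pred["gX"])
```

In this toy case the three genes closest to gX are g1, g2, and g3, so the distant gene g4 does not influence the prediction. A different distance measure (for example, a correlation-based one) or a weighted average could be swapped in without changing the overall structure.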