Predicting gene function in a hierarchical context with an ensemble of classifiers

BACKGROUND: The wide availability of genome-scale data for several organisms has stimulated interest in computational approaches to gene function prediction. Diverse machine learning methods have been applied to unicellular organisms with some success, but few have been extensively tested on higher...

Bibliographic Details
Main authors: Guan, Yuanfang; Myers, Chad L; Hess, David C; Barutcuoglu, Zafer; Caudy, Amy A; Troyanskaya, Olga G
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2447537/
https://www.ncbi.nlm.nih.gov/pubmed/18613947
http://dx.doi.org/10.1186/gb-2008-9-s1-s3
author Guan, Yuanfang
Myers, Chad L
Hess, David C
Barutcuoglu, Zafer
Caudy, Amy A
Troyanskaya, Olga G
collection PubMed
description BACKGROUND: The wide availability of genome-scale data for several organisms has stimulated interest in computational approaches to gene function prediction. Diverse machine learning methods have been applied to unicellular organisms with some success, but few have been extensively tested on higher level, multicellular organisms. A recent mouse function prediction project (MouseFunc) brought together nine bioinformatics teams applying a diverse array of methodologies to mount the first large-scale effort to predict gene function in the laboratory mouse. RESULTS: In this paper, we describe our contribution to this project, an ensemble framework based on the support vector machine that integrates diverse datasets in the context of the Gene Ontology hierarchy. We carry out a detailed analysis of the performance of our ensemble and provide insights into which methods work best under a variety of prediction scenarios. In addition, we applied our method to Saccharomyces cerevisiae and have experimentally confirmed functions for a novel mitochondrial protein. CONCLUSION: Our method consistently performs among the top methods in the MouseFunc evaluation. Furthermore, it exhibits good classification performance across a variety of cellular processes and functions in both a multicellular organism and a unicellular organism, indicating its ability to discover novel biology in diverse settings.
format Text
id pubmed-2447537
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2447537 2008-07-10 Genome Biol Research BioMed Central 2008 2008-06-27 /pmc/articles/PMC2447537/ /pubmed/18613947 http://dx.doi.org/10.1186/gb-2008-9-s1-s3 Text en Copyright © 2008 Guan et al; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predicting gene function in a hierarchical context with an ensemble of classifiers
topic Research