Discriminative local subspaces in gene expression data for effective gene function prediction
Motivation: Massive amounts of genome-wide gene expression data have become available, motivating the development of computational approaches that leverage this information to predict gene function. Among successful approaches, supervised machine learning methods, such as Support Vector Machines (SV...
Main authors: | Puelma, Tomas; Gutiérrez, Rodrigo A.; Soto, Alvaro |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3426849/ https://www.ncbi.nlm.nih.gov/pubmed/22820203 http://dx.doi.org/10.1093/bioinformatics/bts455 |
_version_ | 1782241550363787264 |
---|---|
author | Puelma, Tomas Gutiérrez, Rodrigo A. Soto, Alvaro |
author_facet | Puelma, Tomas Gutiérrez, Rodrigo A. Soto, Alvaro |
author_sort | Puelma, Tomas |
collection | PubMed |
description | Motivation: Massive amounts of genome-wide gene expression data have become available, motivating the development of computational approaches that leverage this information to predict gene function. Among successful approaches, supervised machine learning methods, such as Support Vector Machines (SVMs), have shown superior prediction accuracy. However, these methods lack the simple biological intuition provided by co-expression networks (CNs), limiting their practical usefulness. Results: In this work, we present Discriminative Local Subspaces (DLS), a novel method that combines supervised machine learning and co-expression techniques with the goal of systematically predicting genes involved in specific biological processes of interest. Unlike traditional CNs, DLS uses the knowledge available in Gene Ontology (GO) to generate informative training sets that guide the discovery of expression signatures: expression patterns that are discriminative for genes involved in the biological process of interest. By linking genes co-expressed with these signatures, DLS is able to construct a discriminative CN that links both known and previously uncharacterized genes for the selected biological process. This article focuses on the algorithm behind DLS and shows its predictive power using an Arabidopsis thaliana dataset and a representative set of 101 GO terms from the Biological Process Ontology. Our results show that DLS has a higher average accuracy than both SVMs and CNs. Thus, DLS is able to provide the prediction accuracy of supervised learning methods while maintaining the intuitive understanding of CNs. Availability: A MATLAB® implementation of DLS is available at http://virtualplant.bio.puc.cl/cgi-bin/Lab/tools.cgi Contact: tfpuelma@uc.cl Supplementary Information: Supplementary data are available at http://bioinformatics.mpimp-golm.mpg.de/. |
format | Online Article Text |
id | pubmed-3426849 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3426849 2013-09-01 Discriminative local subspaces in gene expression data for effective gene function prediction Puelma, Tomas Gutiérrez, Rodrigo A. Soto, Alvaro Bioinformatics Original Papers Motivation: Massive amounts of genome-wide gene expression data have become available, motivating the development of computational approaches that leverage this information to predict gene function. Among successful approaches, supervised machine learning methods, such as Support Vector Machines (SVMs), have shown superior prediction accuracy. However, these methods lack the simple biological intuition provided by co-expression networks (CNs), limiting their practical usefulness. Results: In this work, we present Discriminative Local Subspaces (DLS), a novel method that combines supervised machine learning and co-expression techniques with the goal of systematically predicting genes involved in specific biological processes of interest. Unlike traditional CNs, DLS uses the knowledge available in Gene Ontology (GO) to generate informative training sets that guide the discovery of expression signatures: expression patterns that are discriminative for genes involved in the biological process of interest. By linking genes co-expressed with these signatures, DLS is able to construct a discriminative CN that links both known and previously uncharacterized genes for the selected biological process. This article focuses on the algorithm behind DLS and shows its predictive power using an Arabidopsis thaliana dataset and a representative set of 101 GO terms from the Biological Process Ontology. Our results show that DLS has a higher average accuracy than both SVMs and CNs. Thus, DLS is able to provide the prediction accuracy of supervised learning methods while maintaining the intuitive understanding of CNs.
Availability: A MATLAB® implementation of DLS is available at http://virtualplant.bio.puc.cl/cgi-bin/Lab/tools.cgi Contact: tfpuelma@uc.cl Supplementary Information: Supplementary data are available at http://bioinformatics.mpimp-golm.mpg.de/. Oxford University Press 2012-09-01 2012-07-20 /pmc/articles/PMC3426849/ /pubmed/22820203 http://dx.doi.org/10.1093/bioinformatics/bts455 Text en © The Author 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Puelma, Tomas Gutiérrez, Rodrigo A. Soto, Alvaro Discriminative local subspaces in gene expression data for effective gene function prediction |
title | Discriminative local subspaces in gene expression data for effective gene function prediction |
title_full | Discriminative local subspaces in gene expression data for effective gene function prediction |
title_fullStr | Discriminative local subspaces in gene expression data for effective gene function prediction |
title_full_unstemmed | Discriminative local subspaces in gene expression data for effective gene function prediction |
title_short | Discriminative local subspaces in gene expression data for effective gene function prediction |
title_sort | discriminative local subspaces in gene expression data for effective gene function prediction |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3426849/ https://www.ncbi.nlm.nih.gov/pubmed/22820203 http://dx.doi.org/10.1093/bioinformatics/bts455 |
work_keys_str_mv | AT puelmatomas discriminativelocalsubspacesingeneexpressiondataforeffectivegenefunctionprediction AT gutierrezrodrigoa discriminativelocalsubspacesingeneexpressiondataforeffectivegenefunctionprediction AT sotoalvaro discriminativelocalsubspacesingeneexpressiondataforeffectivegenefunctionprediction |