Knowledge-based gene expression classification via matrix factorization

Bibliographic Details
Main authors: Schachtner, R., Lutter, D., Knollmüller, P., Tomé, A. M., Theis, F. J., Schmitz, G., Stetter, M., Vilda, P. Gómez, Lang, E. W.
Format: Text
Language: English
Published: Oxford University Press 2008
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2638868/
https://www.ncbi.nlm.nih.gov/pubmed/18535085
http://dx.doi.org/10.1093/bioinformatics/btn245
Collection: PubMed
Description:

Motivation: Modern machine learning methods based on matrix decomposition techniques, like independent component analysis (ICA) or non-negative matrix factorization (NMF), provide new and efficient analysis tools which are currently explored to analyze gene expression profiles. These exploratory feature extraction techniques yield expression modes (ICA) or metagenes (NMF). These extracted features are considered indicative of underlying regulatory processes. They can as well be applied to the classification of gene expression datasets by grouping samples into different categories for diagnostic purposes or group genes into functional categories for further investigation of related metabolic pathways and regulatory networks.

Results: In this study we focus on unsupervised matrix factorization techniques and apply ICA and sparse NMF to microarray datasets. The latter monitor the gene expression levels of human peripheral blood cells during differentiation from monocytes to macrophages. We show that these tools are able to identify relevant signatures in the deduced component matrices and extract informative sets of marker genes from these gene expression profiles. The methods rely on the joint discriminative power of a set of marker genes rather than on single marker genes. With these sets of marker genes, corroborated by leave-one-out or random forest cross-validation, the datasets could easily be classified into related diagnostic categories. The latter correspond to either monocytes versus macrophages or healthy vs Niemann Pick C disease patients.

Supplementary information: Supplementary data are available at Bioinformatics online.

Contact: elmar.lang@biologie.uni-regensburg.de
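The description above outlines a general recipe: factor the expression matrix, use the known sample classes to pick an informative component, and read marker genes off that component. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the authors' implementation: it uses plain NMF with multiplicative updates rather than the sparse NMF or ICA named in the abstract, it runs on synthetic data, it omits the cross-validation step, and all names (nmf, markers, the 20 x 50 matrix, the five elevated genes) are hypothetical.

```python
import numpy as np

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (samples x genes) as X ~ W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    Rows of H play the role of metagenes; W holds the per-sample weights.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_genes = X.shape
    W = rng.random((n_samples, k)) + eps
    H = rng.random((k, n_genes)) + eps
    for _ in range(n_iter):
        # Each update keeps all entries non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for an expression matrix (assumption, not the paper's data):
# 20 samples x 50 genes, where genes 0..4 are elevated in class-1 samples.
rng = np.random.default_rng(42)
labels = np.array([0] * 10 + [1] * 10)
X = rng.random((20, 50))
X[labels == 1, :5] += 5.0

W, H = nmf(X, k=2)

# Knowledge-based step: use the known class labels to choose the metagene
# whose sample weights correlate most strongly with class 1.
corr = [np.corrcoef(W[:, j], labels)[0, 1] for j in range(W.shape[1])]
best = int(np.argmax(corr))

# Candidate marker genes: the genes with the largest loadings in that metagene.
markers = np.argsort(H[best])[::-1][:5]
print("discriminating metagene:", best)
print("candidate marker genes:", sorted(markers.tolist()))
```

The selected gene set could then be passed to any classifier and checked with leave-one-out cross-validation, as the abstract describes; that step is left out here to keep the sketch short.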
ID: pubmed-2638868
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Journal: Bioinformatics (Original Papers)
Dates in record: 2009-02-25, 2008-08-01, 2008-06-05
License: © 2008 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.