Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes

Non-negative matrix factorization (NMF) is a relatively new approach to analyze gene expression data that models data by additive combinations of non-negative basis vectors (metagenes). The non-negativity constraint makes sense biologically as genes may either be expressed or not, but never show neg...

Bibliographic Details
Main authors: Frigyesi, Attila; Höglund, Mattias
Format: Text
Language: English
Published: Libertas Academica 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2623306/
https://www.ncbi.nlm.nih.gov/pubmed/19259414
_version_ 1782163424859389952
author Frigyesi, Attila
Höglund, Mattias
author_facet Frigyesi, Attila
Höglund, Mattias
author_sort Frigyesi, Attila
collection PubMed
description Non-negative matrix factorization (NMF) is a relatively new approach to analyzing gene expression data that models data as additive combinations of non-negative basis vectors (metagenes). The non-negativity constraint makes sense biologically, as genes may either be expressed or not, but never show negative expression. We applied NMF to five different microarray data sets. We estimated the appropriate number of metagenes by comparing the residual error of the NMF reconstruction of the data to that of the NMF reconstruction of permuted data, thus finding when a given solution contained more information than noise. This analysis also revealed that NMF could not factorize one of the data sets in a meaningful way. We used GO categories and predefined gene sets to evaluate the biological significance of the obtained metagenes. By analyzing metagenes specific to the same GO categories, we could show that individual metagenes activated different aspects of the same biological processes. Several of the obtained metagenes correlated with tumor subtypes and tumors with characteristic chromosomal translocations, indicating that metagenes may correspond to specific disease entities. Hence, NMF extracts biologically relevant structures from microarray expression data and may thus contribute to a deeper understanding of tumor behavior.
format Text
id pubmed-2623306
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-26233062009-02-24 Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes Frigyesi, Attila Höglund, Mattias Cancer Inform Original Article Non-negative matrix factorization (NMF) is a relatively new approach to analyzing gene expression data that models data as additive combinations of non-negative basis vectors (metagenes). The non-negativity constraint makes sense biologically, as genes may either be expressed or not, but never show negative expression. We applied NMF to five different microarray data sets. We estimated the appropriate number of metagenes by comparing the residual error of the NMF reconstruction of the data to that of the NMF reconstruction of permuted data, thus finding when a given solution contained more information than noise. This analysis also revealed that NMF could not factorize one of the data sets in a meaningful way. We used GO categories and predefined gene sets to evaluate the biological significance of the obtained metagenes. By analyzing metagenes specific to the same GO categories, we could show that individual metagenes activated different aspects of the same biological processes. Several of the obtained metagenes correlated with tumor subtypes and tumors with characteristic chromosomal translocations, indicating that metagenes may correspond to specific disease entities. Hence, NMF extracts biologically relevant structures from microarray expression data and may thus contribute to a deeper understanding of tumor behavior. Libertas Academica 2008-05-29 /pmc/articles/PMC2623306/ /pubmed/19259414 Text en © 2008 by the authors http://creativecommons.org/licenses/by/3.0 This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
spellingShingle Original Article
Frigyesi, Attila
Höglund, Mattias
Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes
title Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes
title_full Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes
title_fullStr Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes
title_full_unstemmed Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes
title_short Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: Identification of Clinically Relevant Tumor Subtypes
title_sort non-negative matrix factorization for the analysis of complex gene expression data: identification of clinically relevant tumor subtypes
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2623306/
https://www.ncbi.nlm.nih.gov/pubmed/19259414
work_keys_str_mv AT frigyesiattila nonnegativematrixfactorizationfortheanalysisofcomplexgeneexpressiondataidentificationofclinicallyrelevanttumorsubtypes
AT hoglundmattias nonnegativematrixfactorizationfortheanalysisofcomplexgeneexpressiondataidentificationofclinicallyrelevanttumorsubtypes
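
The rank-selection idea in the description (compare the residual error of an NMF reconstruction of the data with that of permuted data) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record does not state which NMF algorithm, error measure, or permutation scheme was used, so the Lee-Seung multiplicative updates, the Frobenius-norm residual, the per-gene shuffling, and all function names (`nmf`, `residual`, `permute_rows`, `residual_curves`) are assumptions made for this example.

```python
import numpy as np


def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V (genes x samples) as W @ H with k metagenes.

    Assumption: Lee-Seung multiplicative updates for the Frobenius norm.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps  # columns of W play the role of metagenes
    H = rng.random((k, m)) + eps  # metagene activity per sample
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def residual(V, k):
    """Frobenius norm of the reconstruction error at rank k."""
    W, H = nmf(V, k)
    return np.linalg.norm(V - W @ H)


def permute_rows(V, rng):
    """Shuffle each gene's values across samples independently.

    Assumption: this keeps each gene's value distribution but breaks
    the correlation structure between genes.
    """
    return np.array([rng.permutation(row) for row in V])


def residual_curves(V, ranks, seed=0):
    """Residual error per rank for the data and for one permuted copy."""
    V_perm = permute_rows(V, np.random.default_rng(seed))
    real = [residual(V, k) for k in ranks]
    perm = [residual(V_perm, k) for k in ranks]
    return real, perm


# Synthetic toy data with three planted metagenes (not real expression data).
rng = np.random.default_rng(1)
V = rng.random((60, 3)) @ rng.random((3, 20)) + 0.01 * rng.random((60, 20))
ranks = [1, 2, 3, 4, 5]
real, perm = residual_curves(V, ranks)
for k, r, p in zip(ranks, real, perm):
    print(f"k={k}  data residual={r:.3f}  permuted residual={p:.3f}")
```

Under these assumptions, a rank would be considered informative while adding a metagene reduces the residual of the real data more than it reduces the residual of the permuted data; once the two curves fall at a similar rate, further metagenes are mostly fitting noise. In practice one would average over several permutations and random initializations rather than rely on a single run.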