Impact of the Choice of Normalization Method on Molecular Cancer Class Discovery Using Nonnegative Matrix Factorization

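Nonnegative Matrix Factorization (NMF) has proved to be an effective method for unsupervised clustering analysis of gene expression data. By the nonnegativity constraint, NMF provides a decomposition of the data matrix into two matrices that have been used for clustering analysis. However, the decomposition is not unique. This allows different clustering results to be obtained, resulting in different interpretations of the decomposition. To alleviate this problem, some existing methods directly enforce uniqueness to some extent by adding regularization terms in the NMF objective function. Alternatively, various normalization methods have been applied to the factor matrices; however, the effects of the choice of normalization have not been carefully investigated. Here we investigate the performance of NMF for the task of cancer class discovery, under a wide range of normalization choices. After extensive evaluations, we observe that the maximum norm showed the best performance, although the maximum norm has not previously been used for NMF. Matlab codes are freely available from: http://maths.nuigalway.ie/~haixuanyang/pNMF/pNMF.htm.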
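
The description in this record notes that the NMF decomposition is not unique and that the factor matrices are therefore normalized. A minimal sketch of why that choice can matter: for any positive diagonal matrix D, (W D^-1)(D H) = W H, so the fit is unchanged while cluster labels read off H can move. Everything below is an illustrative assumption rather than the authors' Matlab code: the toy matrices W and H, the helper names, the rule of assigning each sample to the factor with the largest coefficient in H, and the decision to apply the maximum norm to the columns of W.

```python
# Illustrative sketch only: the matrices and helper names are hypothetical.

def matmul(A, B):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rescale(W, H, d):
    """Return (W D^-1, D H) for D = diag(d) with every d[j] > 0."""
    W2 = [[w / dj for w, dj in zip(row, d)] for row in W]
    H2 = [[dj * h for h in row] for row, dj in zip(H, d)]
    return W2, H2

def max_norm_normalize(W, H):
    """Scale each column of W so its largest entry is 1, compensating in H.

    Assumes every column of W has at least one positive entry.
    """
    d = [max(col) for col in zip(*W)]
    return rescale(W, H, d)

def assign_clusters(H):
    """Assign each sample (column of H) to the factor with the largest coefficient."""
    return [max(range(len(col)), key=lambda j: col[j]) for col in zip(*H)]

# Toy factors: 3 genes x 2 factors, and 2 factors x 3 samples.
W = [[4.0, 0.5],
     [2.0, 1.0],
     [0.0, 0.5]]
H = [[1.0, 0.5, 0.25],
     [0.5, 4.0, 0.5]]

W2, H2 = max_norm_normalize(W, H)

before = assign_clusters(H)   # labels from the unnormalized factors
after = assign_clusters(H2)   # labels after maximum-norm normalization

print("same product:", matmul(W, H) == matmul(W2, H2))
print("labels before:", before)
print("labels after: ", after)
```

In this toy example both factorizations reproduce the same product, yet the third sample moves from the second factor to the first once the columns of W are rescaled, which is the kind of sensitivity the description refers to.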

Bibliographic Details
Main Authors: Yang, Haixuan, Seoighe, Cathal
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5065197/
https://www.ncbi.nlm.nih.gov/pubmed/27741311
http://dx.doi.org/10.1371/journal.pone.0164880
author Yang, Haixuan
Seoighe, Cathal
collection PubMed
description Nonnegative Matrix Factorization (NMF) has proved to be an effective method for unsupervised clustering analysis of gene expression data. By the nonnegativity constraint, NMF provides a decomposition of the data matrix into two matrices that have been used for clustering analysis. However, the decomposition is not unique. This allows different clustering results to be obtained, resulting in different interpretations of the decomposition. To alleviate this problem, some existing methods directly enforce uniqueness to some extent by adding regularization terms in the NMF objective function. Alternatively, various normalization methods have been applied to the factor matrices; however, the effects of the choice of normalization have not been carefully investigated. Here we investigate the performance of NMF for the task of cancer class discovery, under a wide range of normalization choices. After extensive evaluations, we observe that the maximum norm showed the best performance, although the maximum norm has not previously been used for NMF. Matlab codes are freely available from: http://maths.nuigalway.ie/~haixuanyang/pNMF/pNMF.htm.
format Online
Article
Text
id pubmed-5065197
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5065197 2016-10-27 PLoS One Research Article Public Library of Science 2016-10-14 /pmc/articles/PMC5065197/ /pubmed/27741311 http://dx.doi.org/10.1371/journal.pone.0164880 Text en © 2016 Yang, Seoighe http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Impact of the Choice of Normalization Method on Molecular Cancer Class Discovery Using Nonnegative Matrix Factorization
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5065197/
https://www.ncbi.nlm.nih.gov/pubmed/27741311
http://dx.doi.org/10.1371/journal.pone.0164880