Simultaneous Non-Negative Matrix Factorization for Multiple Large Scale Gene Expression Datasets in Toxicology

Non-negative matrix factorization is a useful tool for reducing the dimension of large datasets. This work considers simultaneous non-negative matrix factorization of multiple sources of data. In particular, we perform the first study that involves more than two datasets. We discuss the algorithmic...

Bibliographic Details
Main Authors: Lee, Clare M., Mudaliar, Manikhandan A. V., Haggart, D. R., Wolf, C. Roland, Miele, Gino, Vass, J. Keith, Higham, Desmond J., Crowther, Daniel
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3522745/
https://www.ncbi.nlm.nih.gov/pubmed/23272042
http://dx.doi.org/10.1371/journal.pone.0048238
author Lee, Clare M.
Mudaliar, Manikhandan A. V.
Haggart, D. R.
Wolf, C. Roland
Miele, Gino
Vass, J. Keith
Higham, Desmond J.
Crowther, Daniel
collection PubMed
description Non-negative matrix factorization is a useful tool for reducing the dimension of large datasets. This work considers simultaneous non-negative matrix factorization of multiple sources of data. In particular, we perform the first study that involves more than two datasets. We discuss the algorithmic issues required to convert the approach into a practical computational tool and apply the technique to new gene expression data quantifying the molecular changes in four tissue types due to different dosages of an experimental panPPAR agonist in mouse. This study is of interest in toxicology because, whilst PPARs form potential therapeutic targets for diabetes, it is known that they can induce serious side-effects. Our results show that the practical simultaneous non-negative matrix factorization developed here can add value to the data analysis. In particular, we find that factorizing the data as a single object allows us to distinguish between the four tissue types, but does not correctly reproduce the known dosage level groups. Applying our new approach, which treats the four tissue types as providing distinct, but related, datasets, we find that the dosage level groups are respected. The new algorithm then provides separate gene list orderings that can be studied for each tissue type, and compared with the ordering arising from the single factorization. We find that many of our conclusions can be corroborated with known biological behaviour, and others offer new insights into the toxicological effects. Overall, the algorithm shows promise for early detection of toxicity in the drug discovery process.
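The abstract contrasts factorizing the data as a single object with treating the four tissue types as distinct but related datasets. The record does not give the algorithm itself, so the following is only a minimal sketch of one plausible formulation, not the authors' method. It assumes each dataset A_k (genes by samples) is approximated as W_k H, with a separate non-negative gene factor W_k per dataset and one sample factor H shared by all datasets, fitted by standard multiplicative updates on the summed squared Frobenius error. The names `simultaneous_nmf` and `total_error`, the parameters, and the unweighted objective are all assumptions made for illustration.

```python
import numpy as np


def simultaneous_nmf(datasets, rank, n_iter=200, eps=1e-9, seed=0):
    """Sketch: fit A_k ~ W_k @ H with one H shared across all datasets.

    Assumption (not from the source record): every dataset has the same
    number of columns (samples), and the objective is the unweighted sum
    of squared Frobenius errors over the datasets.
    """
    n_samples = datasets[0].shape[1]
    if any(A.shape[1] != n_samples for A in datasets):
        raise ValueError("all datasets must have the same number of columns")
    if any((A < 0).any() for A in datasets):
        raise ValueError("all datasets must be non-negative")

    rng = np.random.default_rng(seed)
    # Strictly positive random starting points for every factor.
    Ws = [rng.random((A.shape[0], rank)) + eps for A in datasets]
    H = rng.random((rank, n_samples)) + eps

    for _ in range(n_iter):
        # Update each dataset-specific factor W_k with H held fixed.
        for W, A in zip(Ws, datasets):
            W *= (A @ H.T) / (W @ (H @ H.T) + eps)
        # Update the shared factor H using contributions from every dataset.
        numer = sum(W.T @ A for W, A in zip(Ws, datasets))
        denom = sum(W.T @ W for W in Ws) @ H + eps
        H *= numer / denom
    return Ws, H


def total_error(datasets, Ws, H):
    """Summed squared Frobenius reconstruction error over all datasets."""
    return sum(np.linalg.norm(A - W @ H) ** 2 for A, W in zip(datasets, Ws))
```

Under these assumptions, the columns of each W_k could be sorted to give a separate gene ordering per dataset, while the rows of the shared H group the samples. With an unweighted objective this is equivalent to ordinary NMF on the datasets stacked along the gene axis; any per-dataset weighting or normalization would be a further design choice that this record does not describe.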
format Online
Article
Text
id pubmed-3522745
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3522745 2012-12-27 PLoS One Research Article
Public Library of Science 2012-12-14 /pmc/articles/PMC3522745/ /pubmed/23272042 http://dx.doi.org/10.1371/journal.pone.0048238 Text en © 2012 Lee et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Simultaneous Non-Negative Matrix Factorization for Multiple Large Scale Gene Expression Datasets in Toxicology
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3522745/
https://www.ncbi.nlm.nih.gov/pubmed/23272042
http://dx.doi.org/10.1371/journal.pone.0048238