
DeepMF: deciphering the latent patterns in omics profiles with a deep learning method


Bibliographic Details
Main Authors: Chen, Lingxi; Xu, Jiao; Li, Shuai Cheng
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Methodology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933662/
https://www.ncbi.nlm.nih.gov/pubmed/31881818
http://dx.doi.org/10.1186/s12859-019-3291-6
collection PubMed
description BACKGROUND: With recent advances in high-throughput technologies, matrix factorization techniques are increasingly being utilized for mapping quantitative omics profiling matrix data into low-dimensional embedding space, in the hope of uncovering insights in the underlying biological processes. Nevertheless, current matrix factorization tools fall short in handling noisy data and missing entries, both deficiencies that are often found in real-life data. RESULTS: Here, we propose DeepMF, a deep neural network-based factorization model. DeepMF disentangles the association between molecular feature-associated and sample-associated latent matrices, and is tolerant to noisy and missing values. It exhibited feasible cancer subtype discovery efficacy on mRNA, miRNA, and protein profiles of medulloblastoma cancer, leukemia cancer, breast cancer, and small-blue-round-cell cancer, achieving the highest clustering accuracy of 76%, 100%, 92%, and 100% respectively. When analyzing data sets with 70% missing entries, DeepMF gave the best recovery capacity with silhouette values of 0.47, 0.6, 0.28, and 0.44, outperforming other state-of-the-art MF tools on the cancer data sets Medulloblastoma, Leukemia, TCGA BRCA, and SRBCT. Its embedding strength as measured by clustering accuracy is 88%, 100%, 84%, and 96% on these data sets, which improves on the current best methods 76%, 100%, 78%, and 87%. CONCLUSION: DeepMF demonstrated robust denoising, imputation, and embedding ability. It offers insights to uncover the underlying biological processes such as cancer subtype discovery. Our implementation of DeepMF can be found at https://github.com/paprikachan/DeepMF.
id pubmed-6933662
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6933662 2019-12-30
BMC Bioinformatics, Methodology
BioMed Central, 2019-12-27
Text en. © The Author(s) 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Methodology