
Learning common and specific patterns from data of multiple interrelated biological scenarios with matrix factorization

High-throughput biological technologies (e.g. ChIP-seq, RNA-seq and single-cell RNA-seq) rapidly accelerate the accumulation of genome-wide omics data in diverse interrelated biological scenarios (e.g. cells, tissues and conditions). Integration and differential analysis are two common paradigms for...


Bibliographic Details
Main authors: Zhang, Lihua; Zhang, Shihua
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6649783/
https://www.ncbi.nlm.nih.gov/pubmed/31175825
http://dx.doi.org/10.1093/nar/gkz488
collection PubMed
description High-throughput biological technologies (e.g. ChIP-seq, RNA-seq and single-cell RNA-seq) rapidly accelerate the accumulation of genome-wide omics data in diverse interrelated biological scenarios (e.g. cells, tissues and conditions). Integration and differential analysis are two common paradigms for exploring and analyzing such data. However, current integrative methods usually ignore the differential part, and typical differential analysis methods either fail to identify combinatorial patterns of difference or require matched dimensions of the data. Here, we propose a flexible framework CSMF to combine them into one paradigm to simultaneously reveal Common and Specific patterns via Matrix Factorization from data generated under interrelated biological scenarios. We demonstrate the effectiveness of CSMF with four representative applications including pairwise ChIP-seq data describing the chromatin modification map between K562 and Huvec cell lines; pairwise RNA-seq data representing the expression profiles of two different cancers; RNA-seq data of three breast cancer subtypes; and single-cell RNA-seq data of human embryonic stem cell differentiation at six time points. Extensive analysis yields novel insights into hidden combinatorial patterns in these multi-modal data. Results demonstrate that CSMF is a powerful tool to uncover common and specific patterns with significant biological implications from data of interrelated biological scenarios.
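The description above says CSMF reveals common and specific patterns via matrix factorization, but this record does not give the formulation. The following is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes two non-negative matrices X1 and X2 that share their rows, approximates each as X_i ≈ W_c H_c,i + W_s,i H_s,i, and fits the factors with standard multiplicative non-negative matrix factorization updates on the concatenated data, keeping the loadings of each specific factor fixed at zero on the other dataset. The function name csmf_sketch and all of its parameters are hypothetical.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(X, W, H):
    """Squared Frobenius norm of X - W H."""
    R = matmul(W, H)
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))


def csmf_sketch(X1, X2, k_common, k1, k2, iters=200, seed=0, eps=1e-9):
    """Hypothetical common/specific factorization of two non-negative matrices.

    Assumed model (not taken from the record): X1 and X2 share rows, and
    [X1 X2] ~ W H, where W = [W_common, W_spec1, W_spec2] and the rows of H
    for the specific factors are zero on the columns of the other dataset.
    """
    rng = random.Random(seed)
    m, n1, n2 = len(X1), len(X1[0]), len(X2[0])
    X = [r1 + r2 for r1, r2 in zip(X1, X2)]  # column-wise concatenation
    k = k_common + k1 + k2

    def allowed(r, j):
        # Common factors load on every column; specific ones on one dataset.
        if r < k_common:
            return True
        if r < k_common + k1:
            return j < n1
        return j >= n1

    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 if allowed(r, j) else 0.0
          for j in range(n1 + n2)] for r in range(k)]
    history = [frob_error(X, W, H)]

    for _ in range(iters):
        # Multiplicative update of H; entries that start at zero stay zero.
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # Multiplicative update of W.
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
        history.append(frob_error(X, W, H))
    return W, H, history
```

In this sketch the first k_common columns of W play the role of common patterns and the remaining columns the role of patterns specific to X1 or X2; the zero blocks in H are what tie a specific pattern to a single dataset. Because only the rows are shared, X1 and X2 may have different numbers of columns.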
id pubmed-6649783
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6649783 2019-07-29 Nucleic Acids Res Computational Biology Oxford University Press 2019-07-26 2019-06-08 /pmc/articles/PMC6649783/ /pubmed/31175825 http://dx.doi.org/10.1093/nar/gkz488 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Computational Biology