Multi‐Omics Factor Analysis—a framework for unsupervised integration of multi‐omics data sets

Multi‐omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi‐Omics Factor Analysis (MOFA), a computational method for discovering the...

Bibliographic Details
Main Authors: Argelaguet, Ricard; Velten, Britta; Arnol, Damien; Dietrich, Sascha; Zenz, Thorsten; Marioni, John C; Buettner, Florian; Huber, Wolfgang; Stegle, Oliver
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects: Methods
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6010767/
https://www.ncbi.nlm.nih.gov/pubmed/29925568
http://dx.doi.org/10.15252/msb.20178124
author Argelaguet, Ricard
Velten, Britta
Arnol, Damien
Dietrich, Sascha
Zenz, Thorsten
Marioni, John C
Buettner, Florian
Huber, Wolfgang
Stegle, Oliver
collection PubMed
description Multi‐omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi‐Omics Factor Analysis (MOFA), a computational method for discovering the principal sources of variation in multi‐omics data sets. MOFA infers a set of (hidden) factors that capture biological and technical sources of variability. It disentangles axes of heterogeneity that are shared across multiple modalities and those specific to individual data modalities. The learnt factors enable a variety of downstream analyses, including identification of sample subgroups, data imputation and the detection of outlier samples. We applied MOFA to a cohort of 200 patient samples of chronic lymphocytic leukaemia, profiled for somatic mutations, RNA expression, DNA methylation and ex vivo drug responses. MOFA identified major dimensions of disease heterogeneity, including immunoglobulin heavy‐chain variable region status, trisomy of chromosome 12 and previously underappreciated drivers, such as response to oxidative stress. In a second application, we used MOFA to analyse single‐cell multi‐omics data, identifying coordinated transcriptional and epigenetic changes along cell differentiation.
format Online
Article
Text
id pubmed-6010767
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6010767 2018-06-27 Mol Syst Biol Methods John Wiley and Sons Inc. 2018-06-20 /pmc/articles/PMC6010767/ /pubmed/29925568 http://dx.doi.org/10.15252/msb.20178124 Text en © 2018 The Authors. Published under the terms of the CC BY 4.0 license. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Multi‐Omics Factor Analysis—a framework for unsupervised integration of multi‐omics data sets
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6010767/
https://www.ncbi.nlm.nih.gov/pubmed/29925568
http://dx.doi.org/10.15252/msb.20178124