STATegra: Multi-Omics Data Integration – A Conceptual Scheme With a Bioinformatics Pipeline

Technologies for profiling samples using different omics platforms have been at the forefront since the human genome project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appe...

Bibliographic Details
Main authors: Planell, Nuria, Lagani, Vincenzo, Sebastian-Leon, Patricia, van der Kloet, Frans, Ewing, Ewoud, Karathanasis, Nestoras, Urdangarin, Arantxa, Arozarena, Imanol, Jagodic, Maja, Tsamardinos, Ioannis, Tarazona, Sonia, Conesa, Ana, Tegner, Jesper, Gomez-Cabrero, David
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7970106/
https://www.ncbi.nlm.nih.gov/pubmed/33747045
http://dx.doi.org/10.3389/fgene.2021.620453
author Planell, Nuria
Lagani, Vincenzo
Sebastian-Leon, Patricia
van der Kloet, Frans
Ewing, Ewoud
Karathanasis, Nestoras
Urdangarin, Arantxa
Arozarena, Imanol
Jagodic, Maja
Tsamardinos, Ioannis
Tarazona, Sonia
Conesa, Ana
Tegner, Jesper
Gomez-Cabrero, David
author_sort Planell, Nuria
collection PubMed
description Technologies for profiling samples using different omics platforms have been at the forefront since the Human Genome Project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch, with an arbitrary decision over which tools to use and how to combine them. There is therefore an unmet need to conceptualize how to integrate such data and to implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), intended to be as generic as possible for multi-omics analysis, that combines available multi-omics analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two case studies from The Cancer Genome Atlas (TCGA). For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond that of the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as open source in the STATegRa Bioconductor package.
format Online
Article
Text
id pubmed-7970106
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7970106 2021-03-19 STATegra: Multi-Omics Data Integration – A Conceptual Scheme With a Bioinformatics Pipeline. Front Genet. Genetics. Frontiers Media S.A. 2021-03-04 /pmc/articles/PMC7970106/ /pubmed/33747045 http://dx.doi.org/10.3389/fgene.2021.620453 Text en Copyright © 2021 Planell, Lagani, Sebastian-Leon, van der Kloet, Ewing, Karathanasis, Urdangarin, Arozarena, Jagodic, Tsamardinos, Tarazona, Conesa, Tegner and Gomez-Cabrero. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title STATegra: Multi-Omics Data Integration – A Conceptual Scheme With a Bioinformatics Pipeline
topic Genetics