Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling

Recent technological advances and international efforts, such as The Cancer Genome Atlas (TCGA), have made available several pan-cancer datasets encompassing multiple omics layers with detailed clinical information in large collections of samples. The need has thus arisen for the development of compu...

Bibliographic Details
Main authors: Chierici, Marco, Bussola, Nicole, Marcolini, Alessia, Francescatto, Margherita, Zandonà, Alessandro, Trastulla, Lucia, Agostinelli, Claudio, Jurman, Giuseppe, Furlanello, Cesare
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Oncology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7340129/
https://www.ncbi.nlm.nih.gov/pubmed/32714870
http://dx.doi.org/10.3389/fonc.2020.01065
author Chierici, Marco
Bussola, Nicole
Marcolini, Alessia
Francescatto, Margherita
Zandonà, Alessandro
Trastulla, Lucia
Agostinelli, Claudio
Jurman, Giuseppe
Furlanello, Cesare
collection PubMed
description Recent technological advances and international efforts, such as The Cancer Genome Atlas (TCGA), have made available several pan-cancer datasets encompassing multiple omics layers with detailed clinical information in large collections of samples. The need has thus arisen for computational methods aimed at improving cancer subtyping and biomarker identification from multi-modal data. Here we apply the Integrative Network Fusion (INF) pipeline, which combines multiple omics layers by exploiting Similarity Network Fusion (SNF) within a machine learning predictive framework. INF includes a feature ranking scheme (rSNF) on SNF-integrated features, used by a classifier over juxtaposed multi-omics features (juXT). In particular, we show instances of INF implementing Random Forest (RF) and linear Support Vector Machine (LSVM) as the classifier; two baseline RF and LSVM models are also trained on juXT. Finally, a compact RF model, called rSNFi, is derived by training on the intersection of the top-ranked biomarkers from the two approaches, juXT and rSNF. All classifiers are run in a 10x5-fold cross-validation schema to warrant reproducibility, following the guidelines for an unbiased Data Analysis Plan by the US FDA-led initiatives MAQC/SEQC. INF is demonstrated on four classification tasks on three multi-modal TCGA oncogenomics datasets. Gene expression, protein expression and copy number variants are used to predict estrogen receptor status (BRCA-ER, N = 381) and breast invasive carcinoma subtypes (BRCA-subtypes, N = 305), while gene expression, miRNA expression and methylation data are used as predictor layers for acute myeloid leukemia and renal clear cell carcinoma survival (AML-OS, N = 157; KIRC-OS, N = 181). In test, INF achieved similar Matthews Correlation Coefficient (MCC) values and 97% to 83% smaller feature sizes (FS) compared with juXT for BRCA-ER (MCC: 0.83 vs. 0.80; FS: 56 vs. 1801) and BRCA-subtypes (0.84 vs. 0.80; 302 vs. 1801), and improved KIRC-OS performance (0.38 vs. 0.31; 111 vs. 2319). INF predictions are generally more accurate in test than one-dimensional omics models, and use smaller signatures, with transcriptomics consistently playing the leading role. Overall, the INF framework effectively integrates multiple data levels in oncogenomics classification tasks, improving over the performance of single layers alone and naive juxtaposition, and provides compact signature sizes.
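The evaluation quantities named in the abstract (MCC, relative feature-size reduction, and a 10x5-fold resampling schema) can be illustrated with a minimal Python sketch. This is not the authors' INF code: the function names `mcc_binary`, `feature_reduction_pct` and `repeated_kfold` are hypothetical, the MCC shown is the standard binary form (the multi-class BRCA-subtypes task would need the generalized version), and the splitter is plain shuffled k-fold with no stratification, which is an assumption since the abstract does not say how folds were drawn.

```python
import math
import random


def mcc_binary(tp, tn, fp, fn):
    """Matthews Correlation Coefficient for a 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def feature_reduction_pct(n_selected, n_total):
    """Percentage by which a signature of n_selected features is smaller than n_total."""
    return 100.0 * (1.0 - n_selected / n_total)


def repeated_kfold(n_samples, k=5, repeats=10, seed=0):
    """Yield (train, test) index lists for `repeats` reshuffled k-fold partitions.

    Illustrative only: unstratified, and not taken from the INF pipeline.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all samples
        for held_out in range(k):
            test = sorted(folds[held_out])
            train = sorted(
                i for j, fold in enumerate(folds) if j != held_out for i in fold
            )
            yield train, test


if __name__ == "__main__":
    # Feature sizes (INF vs. juXT) as reported in the abstract.
    reported = {"BRCA-ER": (56, 1801), "BRCA-subtypes": (302, 1801), "KIRC-OS": (111, 2319)}
    for task, (inf_fs, juxt_fs) in reported.items():
        print(f"{task}: {feature_reduction_pct(inf_fs, juxt_fs):.1f}% fewer features")

    # 10 repeats of 5-fold CV on the BRCA-ER sample size gives 50 train/test splits.
    print(sum(1 for _ in repeated_kfold(381, k=5, repeats=10)), "splits")
```

With the feature sizes quoted in the abstract, the reductions come to about 96.9% (BRCA-ER) and 83.2% (BRCA-subtypes), matching the stated "97% to 83%" range, and about 95.2% for KIRC-OS.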
format Online
Article
Text
id pubmed-7340129
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7340129 2020-07-23 Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling Front Oncol Oncology Frontiers Media S.A. 2020-06-30 /pmc/articles/PMC7340129/ /pubmed/32714870 http://dx.doi.org/10.3389/fonc.2020.01065 Text en Copyright © 2020 Chierici, Bussola, Marcolini, Francescatto, Zandonà, Trastulla, Agostinelli, Jurman and Furlanello. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling
topic Oncology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7340129/
https://www.ncbi.nlm.nih.gov/pubmed/32714870
http://dx.doi.org/10.3389/fonc.2020.01065