
CANTARE: finding and visualizing network-based multi-omic predictive models

Bibliographic Details
Main authors: Siebert, Janet C., Saint-Cyr, Martine, Borengasser, Sarah J., Wagner, Brandie D., Lozupone, Catherine A., Görg, Carsten
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7896366/
https://www.ncbi.nlm.nih.gov/pubmed/33607938
http://dx.doi.org/10.1186/s12859-021-04016-8
_version_ 1783653529106251776
author Siebert, Janet C.
Saint-Cyr, Martine
Borengasser, Sarah J.
Wagner, Brandie D.
Lozupone, Catherine A.
Görg, Carsten
author_facet Siebert, Janet C.
Saint-Cyr, Martine
Borengasser, Sarah J.
Wagner, Brandie D.
Lozupone, Catherine A.
Görg, Carsten
author_sort Siebert, Janet C.
collection PubMed
description BACKGROUND: One goal of multi-omic studies is to identify interpretable predictive models for outcomes of interest, with analytes drawn from multiple omes. Such findings could support refined biological insight and hypothesis generation. However, standard analytical approaches are not designed to be “ome aware.” Thus, some researchers analyze data from one ome at a time, and then combine predictions across omes. Others resort to correlation studies, cataloging pairwise relationships, but lacking an obvious approach for cohesive and interpretable summaries of these catalogs. METHODS: We present a novel workflow for building predictive regression models from network neighborhoods in multi-omic networks. First, we generate pairwise regression models across all pairs of analytes from all omes, encoding the resulting “top table” of relationships in a network. Then, we build predictive logistic regression models using the analytes in network neighborhoods of interest. We call this method CANTARE (Consolidated Analysis of Network Topology And Regression Elements). RESULTS: We applied CANTARE to previously published data from healthy controls and patients with inflammatory bowel disease (IBD) consisting of three omes: gut microbiome, metabolomics, and microbial-derived enzymes. We identified 8 unique predictive models with AUC > 0.90. The number of predictors in these models ranged from 3 to 13. We compare the results of CANTARE to random forests and elastic-net penalized regressions, analyzing AUC, predictions, and predictors. CANTARE AUC values were competitive with those generated by random forests and penalized regressions. The top 3 CANTARE models had a greater dynamic range of predicted probabilities than did random forests and penalized regressions (p-value = 1.35 × 10^-5). CANTARE models were significantly more likely to prioritize predictors from multiple omes than were the alternatives (p-value = 0.005). We also showed that predictive models from a network based on pairwise models with an interaction term for IBD have higher AUC than predictive models built from a correlation network (p-value = 0.016). R scripts and a CANTARE User’s Guide are available at https://sourceforge.net/projects/cytomelodics/files/CANTARE/. CONCLUSION: CANTARE offers a flexible approach for building parsimonious, interpretable multi-omic models. These models yield quantitative and directional effect sizes for predictors and support the generation of hypotheses for follow-up investigation.
format Online
Article
Text
id pubmed-7896366
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7896366 2021-02-22 CANTARE: finding and visualizing network-based multi-omic predictive models Siebert, Janet C. Saint-Cyr, Martine Borengasser, Sarah J. Wagner, Brandie D. Lozupone, Catherine A. Görg, Carsten BMC Bioinformatics Research Article BACKGROUND: One goal of multi-omic studies is to identify interpretable predictive models for outcomes of interest, with analytes drawn from multiple omes. Such findings could support refined biological insight and hypothesis generation. However, standard analytical approaches are not designed to be “ome aware.” Thus, some researchers analyze data from one ome at a time, and then combine predictions across omes. Others resort to correlation studies, cataloging pairwise relationships, but lacking an obvious approach for cohesive and interpretable summaries of these catalogs. METHODS: We present a novel workflow for building predictive regression models from network neighborhoods in multi-omic networks. First, we generate pairwise regression models across all pairs of analytes from all omes, encoding the resulting “top table” of relationships in a network. Then, we build predictive logistic regression models using the analytes in network neighborhoods of interest. We call this method CANTARE (Consolidated Analysis of Network Topology And Regression Elements). RESULTS: We applied CANTARE to previously published data from healthy controls and patients with inflammatory bowel disease (IBD) consisting of three omes: gut microbiome, metabolomics, and microbial-derived enzymes. We identified 8 unique predictive models with AUC > 0.90. The number of predictors in these models ranged from 3 to 13. We compare the results of CANTARE to random forests and elastic-net penalized regressions, analyzing AUC, predictions, and predictors. CANTARE AUC values were competitive with those generated by random forests and penalized regressions. The top 3 CANTARE models had a greater dynamic range of predicted probabilities than did random forests and penalized regressions (p-value = 1.35 × 10^-5). CANTARE models were significantly more likely to prioritize predictors from multiple omes than were the alternatives (p-value = 0.005). We also showed that predictive models from a network based on pairwise models with an interaction term for IBD have higher AUC than predictive models built from a correlation network (p-value = 0.016). R scripts and a CANTARE User’s Guide are available at https://sourceforge.net/projects/cytomelodics/files/CANTARE/. CONCLUSION: CANTARE offers a flexible approach for building parsimonious, interpretable multi-omic models. These models yield quantitative and directional effect sizes for predictors and support the generation of hypotheses for follow-up investigation. BioMed Central 2021-02-19 /pmc/articles/PMC7896366/ /pubmed/33607938 http://dx.doi.org/10.1186/s12859-021-04016-8 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research Article
Siebert, Janet C.
Saint-Cyr, Martine
Borengasser, Sarah J.
Wagner, Brandie D.
Lozupone, Catherine A.
Görg, Carsten
CANTARE: finding and visualizing network-based multi-omic predictive models
title CANTARE: finding and visualizing network-based multi-omic predictive models
title_full CANTARE: finding and visualizing network-based multi-omic predictive models
title_fullStr CANTARE: finding and visualizing network-based multi-omic predictive models
title_full_unstemmed CANTARE: finding and visualizing network-based multi-omic predictive models
title_short CANTARE: finding and visualizing network-based multi-omic predictive models
title_sort cantare: finding and visualizing network-based multi-omic predictive models
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7896366/
https://www.ncbi.nlm.nih.gov/pubmed/33607938
http://dx.doi.org/10.1186/s12859-021-04016-8
work_keys_str_mv AT siebertjanetc cantarefindingandvisualizingnetworkbasedmultiomicpredictivemodels
AT saintcyrmartine cantarefindingandvisualizingnetworkbasedmultiomicpredictivemodels
AT borengassersarahj cantarefindingandvisualizingnetworkbasedmultiomicpredictivemodels
AT wagnerbrandied cantarefindingandvisualizingnetworkbasedmultiomicpredictivemodels
AT lozuponecatherinea cantarefindingandvisualizingnetworkbasedmultiomicpredictivemodels
AT gorgcarsten cantarefindingandvisualizingnetworkbasedmultiomicpredictivemodels