iTOP: inferring the topology of omics data

MOTIVATION: In biology, we are often faced with multiple datasets recorded on the same set of objects, such as multi-omics and phenotypic data of the same tumors. These datasets are typically not independent from each other. For example, methylation may influence gene expression, which may, in turn,...

Bibliographic Details
Main authors: Aben, Nanne; Westerhuis, Johan A; Song, Yipeng; Kiers, Henk A L; Michaut, Magali; Smilde, Age K; Wessels, Lodewyk F A
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129292/
https://www.ncbi.nlm.nih.gov/pubmed/30423084
http://dx.doi.org/10.1093/bioinformatics/bty636
_version_ 1783353775443935232
author Aben, Nanne
Westerhuis, Johan A
Song, Yipeng
Kiers, Henk A L
Michaut, Magali
Smilde, Age K
Wessels, Lodewyk F A
author_facet Aben, Nanne
Westerhuis, Johan A
Song, Yipeng
Kiers, Henk A L
Michaut, Magali
Smilde, Age K
Wessels, Lodewyk F A
author_sort Aben, Nanne
collection PubMed
description MOTIVATION: In biology, we are often faced with multiple datasets recorded on the same set of objects, such as multi-omics and phenotypic data of the same tumors. These datasets are typically not independent from each other. For example, methylation may influence gene expression, which may, in turn, influence drug response. Such relationships can strongly affect analyses performed on the data, as we have previously shown for the identification of biomarkers of drug response. Therefore, it is important to be able to chart the relationships between datasets. RESULTS: We present iTOP, a methodology to infer a topology of relationships between datasets. We base this methodology on the RV coefficient, a measure of matrix correlation, which can be used to determine how much information is shared between two datasets. We extended the RV coefficient for partial matrix correlations, which allows the use of graph reconstruction algorithms, such as the PC algorithm, to infer the topologies. In addition, since multi-omics data often contain binary data (e.g. mutations), we also extended the RV coefficient for binary data. Applying iTOP to pharmacogenomics data, we found that gene expression acts as a mediator between most other datasets and drug response: only proteomics clearly shares information with drug response that is not present in gene expression. Based on this result, we used TANDEM, a method for drug response prediction, to identify which variables predictive of drug response were distinct to either gene expression or proteomics. AVAILABILITY AND IMPLEMENTATION: An implementation of our methodology is available in the R package iTOP on CRAN. Additionally, an R Markdown document with code to reproduce all figures is provided as Supplementary Material. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
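As an illustration of the matrix correlation mentioned in the description, the following is a minimal sketch of the plain RV coefficient between two datasets measured on the same objects, together with a first-order partial version built by analogy with partial Pearson correlation. It is an assumption-laden sketch and not the iTOP implementation: the function names rv_coefficient and partial_rv are hypothetical, the simulated data are invented for the example, and the published method may use a modified RV coefficient and a binary-data extension that are not reproduced here.

```python
import numpy as np


def rv_coefficient(x, y):
    """Plain RV coefficient between two matrices that share the same rows (objects)."""
    # Column-center each dataset so the configuration matrices reflect covariation.
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    # Object-by-object configuration matrices, each of shape (n, n).
    sx = x @ x.T
    sy = y @ y.T
    # For symmetric matrices, trace(sx @ sy) equals the sum of elementwise products.
    return np.sum(sx * sy) / np.sqrt(np.sum(sx * sx) * np.sum(sy * sy))


def partial_rv(rv_xy, rv_xz, rv_yz):
    """Partial matrix correlation of X and Y given Z (assumed first-order formula).

    Mirrors the partial Pearson correlation formula; undefined when
    rv_xz or rv_yz equals exactly 1.
    """
    return (rv_xy - rv_xz * rv_yz) / np.sqrt((1 - rv_xz**2) * (1 - rv_yz**2))


# Hypothetical toy data: x and y are both noisy, linearly mixed views of z.
rng = np.random.default_rng(0)
z = rng.normal(size=(50, 5))
x = z @ rng.normal(size=(5, 8)) + 0.5 * rng.normal(size=(50, 8))
y = z @ rng.normal(size=(5, 6)) + 0.5 * rng.normal(size=(50, 6))

rv_xy = rv_coefficient(x, y)
rv_xz = rv_coefficient(x, z)
rv_yz = rv_coefficient(y, z)
print("RV(x, y):", rv_xy)
print("partial RV(x, y | z):", partial_rv(rv_xy, rv_xz, rv_yz))
```

Pairwise and partial values of this kind are what a graph reconstruction algorithm such as the PC algorithm would consume as conditional independence evidence; the sketch stops short of that step.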
format Online
Article
Text
id pubmed-6129292
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-61292922018-09-12 iTOP: inferring the topology of omics data Aben, Nanne Westerhuis, Johan A Song, Yipeng Kiers, Henk A L Michaut, Magali Smilde, Age K Wessels, Lodewyk F A Bioinformatics Eccb 2018: European Conference on Computational Biology Proceedings MOTIVATION: In biology, we are often faced with multiple datasets recorded on the same set of objects, such as multi-omics and phenotypic data of the same tumors. These datasets are typically not independent from each other. For example, methylation may influence gene expression, which may, in turn, influence drug response. Such relationships can strongly affect analyses performed on the data, as we have previously shown for the identification of biomarkers of drug response. Therefore, it is important to be able to chart the relationships between datasets. RESULTS: We present iTOP, a methodology to infer a topology of relationships between datasets. We base this methodology on the RV coefficient, a measure of matrix correlation, which can be used to determine how much information is shared between two datasets. We extended the RV coefficient for partial matrix correlations, which allows the use of graph reconstruction algorithms, such as the PC algorithm, to infer the topologies. In addition, since multi-omics data often contain binary data (e.g. mutations), we also extended the RV coefficient for binary data. Applying iTOP to pharmacogenomics data, we found that gene expression acts as a mediator between most other datasets and drug response: only proteomics clearly shares information with drug response that is not present in gene expression. Based on this result, we used TANDEM, a method for drug response prediction, to identify which variables predictive of drug response were distinct to either gene expression or proteomics. AVAILABILITY AND IMPLEMENTATION: An implementation of our methodology is available in the R package iTOP on CRAN. Additionally, an R Markdown document with code to reproduce all figures is provided as Supplementary Material. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2018-09-01 2018-09-08 /pmc/articles/PMC6129292/ /pubmed/30423084 http://dx.doi.org/10.1093/bioinformatics/bty636 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Eccb 2018: European Conference on Computational Biology Proceedings
Aben, Nanne
Westerhuis, Johan A
Song, Yipeng
Kiers, Henk A L
Michaut, Magali
Smilde, Age K
Wessels, Lodewyk F A
iTOP: inferring the topology of omics data
title iTOP: inferring the topology of omics data
title_full iTOP: inferring the topology of omics data
title_fullStr iTOP: inferring the topology of omics data
title_full_unstemmed iTOP: inferring the topology of omics data
title_short iTOP: inferring the topology of omics data
title_sort itop: inferring the topology of omics data
topic Eccb 2018: European Conference on Computational Biology Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129292/
https://www.ncbi.nlm.nih.gov/pubmed/30423084
http://dx.doi.org/10.1093/bioinformatics/bty636
work_keys_str_mv AT abennanne itopinferringthetopologyofomicsdata
AT westerhuisjohana itopinferringthetopologyofomicsdata
AT songyipeng itopinferringthetopologyofomicsdata
AT kiershenkal itopinferringthetopologyofomicsdata
AT michautmagali itopinferringthetopologyofomicsdata
AT smildeagek itopinferringthetopologyofomicsdata
AT wesselslodewykfa itopinferringthetopologyofomicsdata