
A cross-level information transmission network for hierarchical omics data integration and phenotype prediction from a new genotype

MOTIVATION: An unsolved fundamental problem in biology is to predict phenotypes from a new genotype under environmental perturbations. The emergence of multiple omics data provides new opportunities but imposes great challenges in the predictive modeling of genotype-phenotype associations. Firstly,...


Bibliographic Details
Main authors: He, Di; Xie, Lei
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8696111/
https://www.ncbi.nlm.nih.gov/pubmed/34390577
http://dx.doi.org/10.1093/bioinformatics/btab580
_version_ 1784619733466742784
author He, Di
Xie, Lei
collection PubMed
description MOTIVATION: An unsolved fundamental problem in biology is to predict phenotypes from a new genotype under environmental perturbations. The emergence of multiple omics data provides new opportunities but imposes great challenges in the predictive modeling of genotype-phenotype associations. Firstly, the high dimensionality of genomics data and the lack of coherent labeled data often make the existing supervised learning techniques less successful. Secondly, it is challenging to integrate heterogeneous omics data from different resources. Finally, few works have explicitly modeled the information transmission from DNA to phenotype, which involves multiple intermediate molecular types. Higher-level features (e.g. gene expression) usually have stronger discriminative and interpretable power than lower-level features (e.g. somatic mutation). RESULTS: We propose a novel Cross-LEvel Information Transmission (CLEIT) network framework to address the above issues. CLEIT aims to represent the asymmetrical multi-level organization of the biological system by integrating multiple incoherent omics data and to improve the prediction power of low-level features. CLEIT first learns the latent representation of the high-level domain, then uses it as a ground-truth embedding to improve the representation learning of the low-level domain in the form of a contrastive loss. In addition, CLEIT can leverage unlabeled heterogeneous omics data to improve the generalizability of the predictive model. We demonstrate the effectiveness and significant performance boost of CLEIT in predicting anti-cancer drug sensitivity from somatic mutations with the assistance of gene expression data when compared with state-of-the-art methods. CLEIT provides a general framework to model information transmission and integrate multi-modal data in a multi-level system. AVAILABILITY AND IMPLEMENTATION: The source code is freely available at https://github.com/XieResearchGroup/CLEIT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8696111
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8696111 2022-01-04 A cross-level information transmission network for hierarchical omics data integration and phenotype prediction from a new genotype He, Di Xie, Lei Bioinformatics Original Papers Oxford University Press 2021-08-15 /pmc/articles/PMC8696111/ /pubmed/34390577 http://dx.doi.org/10.1093/bioinformatics/btab580 Text en © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A cross-level information transmission network for hierarchical omics data integration and phenotype prediction from a new genotype
title_sort cross-level information transmission network for hierarchical omics data integration and phenotype prediction from a new genotype
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8696111/
https://www.ncbi.nlm.nih.gov/pubmed/34390577
http://dx.doi.org/10.1093/bioinformatics/btab580
work_keys_str_mv AT hedi acrosslevelinformationtransmissionnetworkforhierarchicalomicsdataintegrationandphenotypepredictionfromanewgenotype
AT xielei acrosslevelinformationtransmissionnetworkforhierarchicalomicsdataintegrationandphenotypepredictionfromanewgenotype
AT hedi crosslevelinformationtransmissionnetworkforhierarchicalomicsdataintegrationandphenotypepredictionfromanewgenotype
AT xielei crosslevelinformationtransmissionnetworkforhierarchicalomicsdataintegrationandphenotypepredictionfromanewgenotype
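Illustrative note on the description field: the abstract says that CLEIT treats the high-level latent representation as a ground-truth embedding and improves the low-level representation through a contrastive loss, but this record gives no formula. The sketch below is only one plausible reading, assuming an InfoNCE-style loss with cosine similarity. The function names, the temperature value, and the toy vectors are all hypothetical and are not taken from the paper or from the CLEIT repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_alignment_loss(low, high, temperature=0.5):
    """Mean InfoNCE-style loss pulling low[i] toward its paired high[i].

    Assumption: `high` holds fixed target embeddings (for example from a
    gene expression encoder) and `low` holds trainable embeddings (for
    example from a somatic mutation encoder). For sample i, the other
    rows of `high` serve as negatives.
    """
    total = 0.0
    for i, z in enumerate(low):
        logits = [cosine(z, h) / temperature for h in high]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(low)


# Toy data (hypothetical): three samples with 2-dimensional embeddings.
high = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Same low-level vectors, but paired with the wrong high-level targets.
shuffled = [aligned[1], aligned[2], aligned[0]]

loss_aligned = contrastive_alignment_loss(aligned, high)
loss_shuffled = contrastive_alignment_loss(shuffled, high)
print(loss_aligned < loss_shuffled)
```

In this toy setup the correctly paired embeddings give a lower loss than the mismatched ones, which is the behavior a training procedure would exploit when pulling low-level representations toward their high-level counterparts. A real implementation would use an automatic differentiation framework and keep the high-level targets out of the gradient computation.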