A fair experimental comparison of neural network architectures for latent representations of multi-omics for drug response prediction

BACKGROUND: Recent years have seen a surge of novel neural network architectures for the integration of multi-omics data for prediction. Most of the architectures include either encoders alone or encoders and decoders, i.e., autoencoders of various sorts, to transform multi-omics data into latent re...

Bibliographic Details
Main authors: Hauptmann, Tony; Kramer, Stefan
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9926634/
https://www.ncbi.nlm.nih.gov/pubmed/36788531
http://dx.doi.org/10.1186/s12859-023-05166-7
author Hauptmann, Tony
Kramer, Stefan
collection PubMed
description BACKGROUND: Recent years have seen a surge of novel neural network architectures for the integration of multi-omics data for prediction. Most of these architectures include either encoders alone or encoders and decoders, i.e., autoencoders of various sorts, to transform multi-omics data into latent representations. One important parameter is the depth of integration: the point at which the latent representations are computed or merged, which can be early, intermediate, or late. The literature on integration methods is growing steadily; however, close to nothing is known about the relative performance of these methods under fair experimental conditions and across different use cases.
RESULTS: We developed a comparison framework that trains and optimizes multi-omics integration methods under equal conditions. We incorporated early integration, PCA, and four recently published deep learning methods: MOLI, Super.FELT, OmiEmbed, and MOMA. Further, we devised a novel method, Omics Stacking, that combines the advantages of intermediate and late integration. Experiments were conducted on a public drug response data set with multiple omics data (somatic point mutations, somatic copy number profiles, and gene expression profiles) obtained from cell lines, patient-derived xenografts, and patient samples. Our experiments confirmed that early integration has the lowest predictive performance. Overall, architectures that integrate triplet loss achieved the best results. Statistically significant differences can rarely be observed; however, in terms of the average ranks of methods, Super.FELT consistently performs best in a cross-validation setting and Omics Stacking performs best in an external test set setting.
CONCLUSIONS: We recommend that researchers follow fair comparison protocols, as suggested in the paper. When faced with a new data set, Super.FELT is a good option in the cross-validation setting, and Omics Stacking is a good option in the external test set setting. Statistically significant differences are hardly observable, despite trends in the algorithms’ rankings. Future work on refined methods for transfer learning tailored for this domain may improve the situation for external test sets. The source code of all experiments is available at https://github.com/kramerlab/Multi-Omics_analysis
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05166-7.
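The depth of integration named in the abstract can be illustrated with a small sketch. This is not the authors' implementation (their code is in the repository linked above); it only shows where merging happens in each scheme. The functions `toy_encoder` and `toy_classifier` are hypothetical stand-ins for trained networks, and the sample values are placeholders, not data from the study.

```python
import math
from typing import Dict, List

Vector = List[float]


def toy_encoder(features: Vector) -> Vector:
    """Hypothetical stand-in for a trained encoder: maps any input to 2 latent values."""
    return [sum(features) / len(features), max(features)]


def toy_classifier(latent: Vector) -> float:
    """Hypothetical stand-in for a trained classifier head: returns a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-sum(latent) / len(latent)))


def early_integration(omics: Dict[str, Vector]) -> float:
    # Merge first: concatenate the raw features of all omics types,
    # then encode and classify the merged vector once.
    merged = [v for name in sorted(omics) for v in omics[name]]
    return toy_classifier(toy_encoder(merged))


def intermediate_integration(omics: Dict[str, Vector]) -> float:
    # Merge in latent space: encode each omics type separately,
    # concatenate the latent vectors, then classify once.
    latent = [v for name in sorted(omics) for v in toy_encoder(omics[name])]
    return toy_classifier(latent)


def late_integration(omics: Dict[str, Vector]) -> float:
    # Merge last: encode and classify each omics type separately,
    # then average the per-omics predictions.
    scores = [toy_classifier(toy_encoder(omics[name])) for name in sorted(omics)]
    return sum(scores) / len(scores)


# Placeholder sample with three omics types of different widths.
sample = {
    "mutation": [0.0, 1.0, 0.0],
    "copy_number": [0.5, -0.5],
    "expression": [1.2, 0.3, -0.7, 0.9],
}

for scheme in (early_integration, intermediate_integration, late_integration):
    print(scheme.__name__, round(scheme(sample), 4))
```

With a single omics type the three schemes coincide, since there is nothing to merge; they differ only in whether raw features, latent vectors, or predictions are combined.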
format Online
Article
Text
id pubmed-9926634
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9926634 2023-02-15 Hauptmann, Tony; Kramer, Stefan. BMC Bioinformatics. Research. BioMed Central 2023-02-14 /pmc/articles/PMC9926634/ /pubmed/36788531 http://dx.doi.org/10.1186/s12859-023-05166-7 Text en © The Author(s) 2023
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/
The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A fair experimental comparison of neural network architectures for latent representations of multi-omics for drug response prediction
topic Research