
COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data

MOTIVATION: The size of available omics datasets is steadily increasing with technological advancement in recent years. While this increase in sample size can be used to improve the performance of relevant prediction tasks in healthcare, models that are optimized for large datasets usually operate a...


Bibliographic Details
Main Authors:	Ditz, Jonas C; Reuter, Bernhard; Pfeifer, Nico
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2023
Subjects:	Biomedical Informatics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311322/ https://www.ncbi.nlm.nih.gov/pubmed/37387152 http://dx.doi.org/10.1093/bioinformatics/btad204

_version_	1785066718941413376
author	Ditz, Jonas C Reuter, Bernhard Pfeifer, Nico
author_facet	Ditz, Jonas C Reuter, Bernhard Pfeifer, Nico
author_sort	Ditz, Jonas C
collection	PubMed
description	MOTIVATION: The size of available omics datasets is steadily increasing with technological advancement in recent years. While this increase in sample size can be used to improve the performance of relevant prediction tasks in healthcare, models that are optimized for large datasets usually operate as black boxes. In high-stakes scenarios, like healthcare, using a black-box model poses safety and security issues. Without an explanation about molecular factors and phenotypes that affected the prediction, healthcare providers are left with no choice but to blindly trust the models. We propose a new type of artificial neural network, named Convolutional Omics Kernel Network (COmic). By combining convolutional kernel networks with pathway-induced kernels, our method enables robust and interpretable end-to-end learning on omics datasets ranging in size from a few hundred to several hundreds of thousands of samples. Furthermore, COmic can be easily adapted to utilize multiomics data. RESULTS: We evaluated the performance capabilities of COmic on six different breast cancer cohorts. Additionally, we trained COmic models on multiomics data using the METABRIC cohort. Our models performed either better than or similar to competitors on both tasks. We show how the use of pathway-induced Laplacian kernels opens the black-box nature of neural networks and results in intrinsically interpretable models that eliminate the need for post hoc explanation models. AVAILABILITY AND IMPLEMENTATION: Datasets, labels, and pathway-induced graph Laplacians used for the single-omics tasks can be downloaded at https://ibm.ent.box.com/s/ac2ilhyn7xjj27r0xiwtom4crccuobst/folder/48027287036. While datasets and graph Laplacians for the METABRIC cohort can be downloaded from the above-mentioned repository, the labels have to be downloaded from cBioPortal at https://www.cbioportal.org/study/clinicalData?id=brca_metabric. COmic source code as well as all scripts necessary to reproduce the experiments and analysis are publicly available at https://github.com/jditz/comics.
format	Online Article Text
id	pubmed-10311322
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-10311322 2023-07-01 COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data Ditz, Jonas C Reuter, Bernhard Pfeifer, Nico Bioinformatics Biomedical Informatics MOTIVATION: The size of available omics datasets is steadily increasing with technological advancement in recent years. While this increase in sample size can be used to improve the performance of relevant prediction tasks in healthcare, models that are optimized for large datasets usually operate as black boxes. In high-stakes scenarios, like healthcare, using a black-box model poses safety and security issues. Without an explanation about molecular factors and phenotypes that affected the prediction, healthcare providers are left with no choice but to blindly trust the models. We propose a new type of artificial neural network, named Convolutional Omics Kernel Network (COmic). By combining convolutional kernel networks with pathway-induced kernels, our method enables robust and interpretable end-to-end learning on omics datasets ranging in size from a few hundred to several hundreds of thousands of samples. Furthermore, COmic can be easily adapted to utilize multiomics data. RESULTS: We evaluated the performance capabilities of COmic on six different breast cancer cohorts. Additionally, we trained COmic models on multiomics data using the METABRIC cohort. Our models performed either better or similar to competitors on both tasks. We show how the use of pathway-induced Laplacian kernels opens the black-box nature of neural networks and results in intrinsically interpretable models that eliminate the need for post hoc explanation models. AVAILABILITY AND IMPLEMENTATION: Datasets, labels, and pathway-induced graph Laplacians used for the single-omics tasks can be downloaded at https://ibm.ent.box.com/s/ac2ilhyn7xjj27r0xiwtom4crccuobst/folder/48027287036. While datasets and graph Laplacians for the METABRIC cohort can be downloaded from the above mentioned repository, the labels have to be downloaded from cBioPortal at https://www.cbioportal.org/study/clinicalData?id=brca_metabric. COmic source code as well as all scripts necessary to reproduce the experiments and analysis are publicly available at https://github.com/jditz/comics. Oxford University Press 2023-06-30 /pmc/articles/PMC10311322/ /pubmed/37387152 http://dx.doi.org/10.1093/bioinformatics/btad204 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Biomedical Informatics Ditz, Jonas C Reuter, Bernhard Pfeifer, Nico COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data
title	COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data
title_full	COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data
title_fullStr	COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data
title_full_unstemmed	COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data
title_short	COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data
title_sort	comic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data
topic	Biomedical Informatics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311322/ https://www.ncbi.nlm.nih.gov/pubmed/37387152 http://dx.doi.org/10.1093/bioinformatics/btad204
work_keys_str_mv	AT ditzjonasc comicconvolutionalkernelnetworksforinterpretableendtoendlearningonmultiomicsdata AT reuterbernhard comicconvolutionalkernelnetworksforinterpretableendtoendlearningonmultiomicsdata AT pfeifernico comicconvolutionalkernelnetworksforinterpretableendtoendlearningonmultiomicsdata


Similar Items