Variational Autoencoders for Cancer Data Integration: Design Principles and Computational Practice

Bibliographic Details
Main authors: Simidjievski, Nikola; Bodnar, Cristian; Tariq, Ifrah; Scherer, Paul; Andres Terre, Helena; Shams, Zohreh; Jamnik, Mateja; Liò, Pietro
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6917668/
https://www.ncbi.nlm.nih.gov/pubmed/31921281
http://dx.doi.org/10.3389/fgene.2019.01205
_version_ 1783480449368064000
author Simidjievski, Nikola
Bodnar, Cristian
Tariq, Ifrah
Scherer, Paul
Andres Terre, Helena
Shams, Zohreh
Jamnik, Mateja
Liò, Pietro
author_sort Simidjievski, Nikola
collection PubMed
description International initiatives such as the Molecular Taxonomy of Breast Cancer International Consortium are collecting multiple data sets at different genome-scales with the aim to identify novel cancer bio-markers and predict patient survival. To analyze such data, several machine learning, bioinformatics, and statistical methods have been applied, among them neural networks such as autoencoders. Although these models provide a good statistical learning framework to analyze multi-omic and/or clinical data, there is a distinct lack of work on how to integrate diverse patient data and identify the optimal design best suited to the available data. In this paper, we investigate several autoencoder architectures that integrate a variety of cancer patient data types (e.g., multi-omics and clinical data). We perform extensive analyses of these approaches and provide a clear methodological and computational framework for designing systems that enable clinicians to investigate cancer traits and translate the results into clinical applications. We demonstrate how these networks can be designed, built, and, in particular, applied to tasks of integrative analyses of heterogeneous breast cancer data. The results show that these approaches yield relevant data representations that, in turn, lead to accurate and stable diagnosis.
format Online
Article
Text
id pubmed-6917668
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-69176682020-01-09 Variational Autoencoders for Cancer Data Integration: Design Principles and Computational Practice Simidjievski, Nikola Bodnar, Cristian Tariq, Ifrah Scherer, Paul Andres Terre, Helena Shams, Zohreh Jamnik, Mateja Liò, Pietro Front Genet Genetics International initiatives such as the Molecular Taxonomy of Breast Cancer International Consortium are collecting multiple data sets at different genome-scales with the aim to identify novel cancer bio-markers and predict patient survival. To analyze such data, several machine learning, bioinformatics, and statistical methods have been applied, among them neural networks such as autoencoders. Although these models provide a good statistical learning framework to analyze multi-omic and/or clinical data, there is a distinct lack of work on how to integrate diverse patient data and identify the optimal design best suited to the available data. In this paper, we investigate several autoencoder architectures that integrate a variety of cancer patient data types (e.g., multi-omics and clinical data). We perform extensive analyses of these approaches and provide a clear methodological and computational framework for designing systems that enable clinicians to investigate cancer traits and translate the results into clinical applications. We demonstrate how these networks can be designed, built, and, in particular, applied to tasks of integrative analyses of heterogeneous breast cancer data. The results show that these approaches yield relevant data representations that, in turn, lead to accurate and stable diagnosis. Frontiers Media S.A. 2019-12-11 /pmc/articles/PMC6917668/ /pubmed/31921281 http://dx.doi.org/10.3389/fgene.2019.01205 Text en Copyright © 2019 Simidjievski, Bodnar, Tariq, Scherer, Andres Terre, Shams, Jamnik and Liò http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Variational Autoencoders for Cancer Data Integration: Design Principles and Computational Practice
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6917668/
https://www.ncbi.nlm.nih.gov/pubmed/31921281
http://dx.doi.org/10.3389/fgene.2019.01205