Deep learning based feature-level integration of multi-omics data for breast cancer patients survival analysis

Bibliographic Details
Main authors: Tong, Li; Mitchel, Jonathan; Chatlin, Kevin; Wang, May D.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7493161/
https://www.ncbi.nlm.nih.gov/pubmed/32933515
http://dx.doi.org/10.1186/s12911-020-01225-8
author Tong, Li
Mitchel, Jonathan
Chatlin, Kevin
Wang, May D.
collection PubMed
description BACKGROUND: Breast cancer is the most prevalent cancer in women and among the deadliest. Survival time varies widely among patients, indicating a need to identify prognostic biomarkers for personalized diagnosis and treatment. With the development of new technologies such as next-generation sequencing, multi-omics information is becoming available for a more thorough evaluation of a patient's condition. In this study, we aim to improve breast cancer overall survival prediction by integrating multi-omics data (e.g., gene expression, DNA methylation, miRNA expression, and copy number variations (CNVs)).
METHODS: Motivated by multi-view learning, we propose a novel strategy to integrate multi-omics data for breast cancer survival prediction by applying the complementary and consensus principles. The complementary principle assumes that each omics data type contains modality-unique information. To preserve such information, we develop a concatenation autoencoder (ConcatAE) that concatenates the hidden features learned from each modality for integration. The consensus principle assumes that disagreement among modalities upper-bounds the model error. To remove noise and discrepancies among modalities, we develop a cross-modality autoencoder (CrossAE) that maximizes the agreement among modalities to achieve a modality-invariant representation. We first validate the effectiveness of our proposed models on the MNIST simulated data. We then apply these models to the TCGA breast cancer multi-omics data for overall survival prediction.
RESULTS: For breast cancer overall survival prediction, the integration of DNA methylation and miRNA expression achieves the best overall performance of 0.641 ± 0.031 with ConcatAE, and 0.63 ± 0.081 with CrossAE. Both strategies outperform baseline single-modality models using only DNA methylation (0.583 ± 0.058) or miRNA expression (0.616 ± 0.057).
CONCLUSIONS: We achieve improved overall survival prediction performance by utilizing either the complementary or the consensus information among multi-omics data. The proposed ConcatAE and CrossAE models can inspire future deep representation-based multi-omics integration techniques. We believe these novel multi-omics integration models can benefit the personalized diagnosis and treatment of breast cancer patients.
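The two integration principles described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the encoders here are untrained random linear maps, the feature sizes are arbitrary, the element-wise average used for the consensus case is an assumption, and every function name (`linear_encode`, `concat_integration`, `consensus_integration`, `disagreement`) is hypothetical. It only shows the shape of the idea: concatenation keeps each modality's hidden features, while the consensus route penalizes disagreement between them and merges them into one shared vector.

```python
import random


def random_weights(n_hidden, n_input, rng):
    """Hypothetical stand-in for a trained encoder: a random weight matrix."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_input)]
            for _ in range(n_hidden)]


def linear_encode(x, weights):
    """Project one sample onto hidden features, h = W x (no bias, no nonlinearity)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def concat_integration(hidden_by_modality):
    """Complementary principle: keep every modality's hidden features side by side."""
    fused = []
    for h in hidden_by_modality:
        fused.extend(h)
    return fused


def disagreement(h_a, h_b):
    """Mean squared difference between two hidden vectors of equal length.

    A consensus-style model would be trained to make this quantity small.
    """
    return sum((a - b) ** 2 for a, b in zip(h_a, h_b)) / len(h_a)


def consensus_integration(hidden_by_modality):
    """Assumed merge rule: element-wise average of equally sized hidden vectors."""
    n = len(hidden_by_modality)
    return [sum(vals) / n for vals in zip(*hidden_by_modality)]


rng = random.Random(0)
methylation = [rng.random() for _ in range(6)]  # toy DNA methylation features
mirna = [rng.random() for _ in range(4)]        # toy miRNA expression features

h_meth = linear_encode(methylation, random_weights(3, 6, rng))
h_mirna = linear_encode(mirna, random_weights(3, 4, rng))

fused_concat = concat_integration([h_meth, h_mirna])        # length 3 + 3
fused_consensus = consensus_integration([h_meth, h_mirna])  # length 3
gap = disagreement(h_meth, h_mirna)
```

In a full model, the fused vector would feed a survival prediction network, and the encoders would be trained with reconstruction losses rather than left random.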
format Online
Article
Text
id pubmed-7493161
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7493161 2020-09-16 BMC Med Inform Decis Mak Research Article BioMed Central 2020-09-15 /pmc/articles/PMC7493161/ /pubmed/32933515 http://dx.doi.org/10.1186/s12911-020-01225-8 Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Deep learning based feature-level integration of multi-omics data for breast cancer patients survival analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7493161/
https://www.ncbi.nlm.nih.gov/pubmed/32933515
http://dx.doi.org/10.1186/s12911-020-01225-8