HetEnc: a deep learning predictive model for multi-type biological dataset

BACKGROUND: Researchers today are generating unprecedented amounts of biological data. One trend in current biological research is integrated analysis with multi-platform data. Effective integration of multi-platform data into the solution of a single or multi-task classification problem, however, i...

Bibliographic Details
Main authors: Wu, Leihong; Liu, Xiangwen; Xu, Joshua
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6686264/
https://www.ncbi.nlm.nih.gov/pubmed/31395005
http://dx.doi.org/10.1186/s12864-019-5997-2
_version_ 1783442525537697792
author Wu, Leihong
Liu, Xiangwen
Xu, Joshua
author_facet Wu, Leihong
Liu, Xiangwen
Xu, Joshua
author_sort Wu, Leihong
collection PubMed
description BACKGROUND: Researchers today are generating unprecedented amounts of biological data. One trend in current biological research is integrated analysis with multi-platform data. However, effective integration of multi-platform data into the solution of a single or multi-task classification problem is critical and challenging. In this study, we proposed HetEnc, a novel deep learning-based approach for information domain separation. RESULTS: HetEnc includes both an unsupervised feature representation module and a supervised neural network module to handle multi-platform gene expression datasets. It first constructs three different encoding networks to represent the original gene expression data using high-level abstracted features. A six-layer fully connected feed-forward neural network is then trained using these abstracted features for each targeted endpoint. We applied HetEnc to the SEQC neuroblastoma dataset to demonstrate that it outperforms other machine learning approaches. Although we used multi-platform data in feature abstraction and model training, HetEnc does not need multi-platform data for prediction, enabling a broader application of the trained model by reducing the cost of gene expression profiling for new samples to a single platform. Thus, HetEnc provides a new solution to integrated gene expression analysis, accelerating modern biological research.
format Online
Article
Text
id pubmed-6686264
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6686264 2019-08-12 HetEnc: a deep learning predictive model for multi-type biological dataset Wu, Leihong Liu, Xiangwen Xu, Joshua BMC Genomics Research Article BACKGROUND: Researchers today are generating unprecedented amounts of biological data. One trend in current biological research is integrated analysis with multi-platform data. However, effective integration of multi-platform data into the solution of a single or multi-task classification problem is critical and challenging. In this study, we proposed HetEnc, a novel deep learning-based approach for information domain separation. RESULTS: HetEnc includes both an unsupervised feature representation module and a supervised neural network module to handle multi-platform gene expression datasets. It first constructs three different encoding networks to represent the original gene expression data using high-level abstracted features. A six-layer fully connected feed-forward neural network is then trained using these abstracted features for each targeted endpoint. We applied HetEnc to the SEQC neuroblastoma dataset to demonstrate that it outperforms other machine learning approaches. Although we used multi-platform data in feature abstraction and model training, HetEnc does not need multi-platform data for prediction, enabling a broader application of the trained model by reducing the cost of gene expression profiling for new samples to a single platform. Thus, HetEnc provides a new solution to integrated gene expression analysis, accelerating modern biological research. BioMed Central 2019-08-08 /pmc/articles/PMC6686264/ /pubmed/31395005 http://dx.doi.org/10.1186/s12864-019-5997-2 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Wu, Leihong
Liu, Xiangwen
Xu, Joshua
HetEnc: a deep learning predictive model for multi-type biological dataset
title HetEnc: a deep learning predictive model for multi-type biological dataset
title_full HetEnc: a deep learning predictive model for multi-type biological dataset
title_fullStr HetEnc: a deep learning predictive model for multi-type biological dataset
title_full_unstemmed HetEnc: a deep learning predictive model for multi-type biological dataset
title_short HetEnc: a deep learning predictive model for multi-type biological dataset
title_sort hetenc: a deep learning predictive model for multi-type biological dataset
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6686264/
https://www.ncbi.nlm.nih.gov/pubmed/31395005
http://dx.doi.org/10.1186/s12864-019-5997-2
work_keys_str_mv AT wuleihong hetencadeeplearningpredictivemodelformultitypebiologicaldataset
AT liuxiangwen hetencadeeplearningpredictivemodelformultitypebiologicaldataset
AT xujoshua hetencadeeplearningpredictivemodelformultitypebiologicaldataset