
Tumor gene expression data classification via sample expansion-based deep learning

Since tumor is seriously harmful to human health, effective diagnosis measures are in urgent need for tumor therapy. Early detection of tumor is particularly important for better treatment of patients. A notable issue is how to effectively discriminate tumor samples from normal ones. Many classifica...


Bibliographic Details
Main authors: Liu, Jian; Wang, Xuesong; Cheng, Yuhu; Zhang, Lin
Format: Online Article Text
Language: English
Published: Impact Journals LLC 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5752549/
https://www.ncbi.nlm.nih.gov/pubmed/29312636
http://dx.doi.org/10.18632/oncotarget.22762
author Liu, Jian
Wang, Xuesong
Cheng, Yuhu
Zhang, Lin
collection PubMed
description Since tumor is seriously harmful to human health, effective diagnosis measures are in urgent need for tumor therapy. Early detection of tumor is particularly important for better treatment of patients. A notable issue is how to effectively discriminate tumor samples from normal ones. Many classification methods, such as Support Vector Machines (SVMs), have been proposed for tumor classification. Recently, deep learning has achieved satisfactory performance in the classification task of many areas. However, the application of deep learning is rare in tumor classification due to insufficient training samples of gene expression data. In this paper, a Sample Expansion method is proposed to address the problem. Inspired by the idea of Denoising Autoencoder (DAE), a large number of samples are obtained by randomly cleaning partially corrupted input many times. The expanded samples can not only maintain the merits of corrupted data in DAE but also deal with the problem of insufficient training samples of gene expression data to a certain extent. Since Stacked Autoencoder (SAE) and Convolutional Neural Network (CNN) models show excellent performance in classification task, the applicability of SAE and 1-dimensional CNN (1DCNN) on gene expression data is analyzed. Finally, two deep learning models, Sample Expansion-Based SAE (SESAE) and Sample Expansion-Based 1DCNN (SE1DCNN), are designed to carry out tumor gene expression data classification by using the expanded samples. Experimental studies indicate that SESAE and SE1DCNN are very effective in tumor classification.
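The sample-expansion idea in the description above can be illustrated with a minimal sketch. The abstract does not state the corruption scheme, so this sketch assumes masking noise (randomly setting a fraction of each sample's expression values to zero), which is a common choice for a Denoising Autoencoder. The function name `expand_samples` and the parameters `n_copies` and `corruption_level` are hypothetical and are not taken from the paper.

```python
import numpy as np


def expand_samples(X, y, n_copies=10, corruption_level=0.2, seed=0):
    """Return n_copies randomly masked versions of every sample.

    Assumption: corruption is masking noise, i.e. each value is zeroed
    independently with probability `corruption_level`. The paper's exact
    scheme may differ.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Repeat each row n_copies times: [x0, x0, ..., x1, x1, ...]
    X_rep = np.repeat(X, n_copies, axis=0)
    y_rep = np.repeat(y, n_copies, axis=0)

    # Draw an independent keep/drop mask for every repeated row,
    # so each copy is corrupted differently.
    keep = rng.random(X_rep.shape) >= corruption_level
    return X_rep * keep, y_rep


# Toy data: 3 samples with 4 "genes" each, and binary labels.
X = np.arange(1.0, 13.0).reshape(3, 4)
y = np.array([0, 1, 0])

X_exp, y_exp = expand_samples(X, y, n_copies=5, corruption_level=0.25)
print(X_exp.shape, y_exp.shape)  # (15, 4) (15,)
```

Each expanded row keeps the label of the sample it was derived from, so the enlarged set could be passed to any classifier in place of the original training data.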
format Online
Article
Text
id pubmed-5752549
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Impact Journals LLC
record_format MEDLINE/PubMed
spelling pubmed-5752549 2018-01-08 Oncotarget Research Paper Impact Journals LLC 2017-11-30 /pmc/articles/PMC5752549/ /pubmed/29312636 http://dx.doi.org/10.18632/oncotarget.22762 Text en
Copyright: © 2017 Liu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 (CC BY 3.0) (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Tumor gene expression data classification via sample expansion-based deep learning
topic Research Paper