
Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification

Hematopoietic cancer is a malignant transformation in immune system cells. Hematopoietic cancer is characterized by the cells that are expressed, so it is usually difficult to distinguish its heterogeneities in the hematopoiesis process. Traditional approaches for cancer subtyping use statistical te...


Bibliographic Details
Main authors: Park, Kwang Ho; Batbaatar, Erdenebileg; Piao, Yongjun; Theera-Umpon, Nipon; Ryu, Keun Ho
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7926954/
https://www.ncbi.nlm.nih.gov/pubmed/33672300
http://dx.doi.org/10.3390/ijerph18042197
_version_ 1783659581329637376
author Park, Kwang Ho
Batbaatar, Erdenebileg
Piao, Yongjun
Theera-Umpon, Nipon
Ryu, Keun Ho
author_facet Park, Kwang Ho
Batbaatar, Erdenebileg
Piao, Yongjun
Theera-Umpon, Nipon
Ryu, Keun Ho
author_sort Park, Kwang Ho
collection PubMed
description Hematopoietic cancer is a malignant transformation of immune system cells. It is characterized by the cells that are expressed, so it is usually difficult to distinguish its heterogeneities in the hematopoiesis process. Traditional approaches for cancer subtyping use statistical techniques. Furthermore, because of the overfitting problem with small samples, a minor cancer does not have enough sample material for building a classification model. Therefore, we propose not only to build a classification model for five major subtypes using two kinds of losses, namely reconstruction loss and classification loss, but also to extract suitable features using a deep autoencoder. To address the data imbalance problem, we apply an oversampling algorithm, the synthetic minority oversampling technique (SMOTE). To validate our proposed autoencoder-based feature extraction approach for hematopoietic cancer subtype classification, we compared it with traditional feature selection algorithms (principal component analysis, non-negative matrix factorization) and classification algorithms combined with the SMOTE oversampling approach. Additionally, we used the Shapley Additive exPlanations (SHAP) interpretation technique in our model to explain the important genes/proteins for hematopoietic cancer subtype classification. We also compared five widely used classification algorithms: logistic regression, random forest, k-nearest neighbor, artificial neural network and support vector machine.
The autoencoder-based feature extraction approaches showed good performance. The best result came from the support vector machine algorithm with SMOTE oversampling, using both focal loss and reconstruction loss as the loss function for the autoencoder (AE) feature selection approach, which produced 97.01% accuracy, 92.60% recall, 99.52% specificity, 93.54% F1-measure, 97.87% G-mean and 95.46% index of balanced accuracy as subtype classification performance measures.
format Online
Article
Text
id pubmed-7926954
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-79269542021-03-04 Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification Park, Kwang Ho Batbaatar, Erdenebileg Piao, Yongjun Theera-Umpon, Nipon Ryu, Keun Ho Int J Environ Res Public Health Article Hematopoietic cancer is a malignant transformation of immune system cells. It is characterized by the cells that are expressed, so it is usually difficult to distinguish its heterogeneities in the hematopoiesis process. Traditional approaches for cancer subtyping use statistical techniques. Furthermore, because of the overfitting problem with small samples, a minor cancer does not have enough sample material for building a classification model. Therefore, we propose not only to build a classification model for five major subtypes using two kinds of losses, namely reconstruction loss and classification loss, but also to extract suitable features using a deep autoencoder. To address the data imbalance problem, we apply an oversampling algorithm, the synthetic minority oversampling technique (SMOTE). To validate our proposed autoencoder-based feature extraction approach for hematopoietic cancer subtype classification, we compared it with traditional feature selection algorithms (principal component analysis, non-negative matrix factorization) and classification algorithms combined with the SMOTE oversampling approach. Additionally, we used the Shapley Additive exPlanations (SHAP) interpretation technique in our model to explain the important genes/proteins for hematopoietic cancer subtype classification. We also compared five widely used classification algorithms: logistic regression, random forest, k-nearest neighbor, artificial neural network and support vector machine.
The autoencoder-based feature extraction approaches showed good performance. The best result came from the support vector machine algorithm with SMOTE oversampling, using both focal loss and reconstruction loss as the loss function for the autoencoder (AE) feature selection approach, which produced 97.01% accuracy, 92.60% recall, 99.52% specificity, 93.54% F1-measure, 97.87% G-mean and 95.46% index of balanced accuracy as subtype classification performance measures. MDPI 2021-02-23 2021-02 /pmc/articles/PMC7926954/ /pubmed/33672300 http://dx.doi.org/10.3390/ijerph18042197 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Park, Kwang Ho
Batbaatar, Erdenebileg
Piao, Yongjun
Theera-Umpon, Nipon
Ryu, Keun Ho
Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification
title Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification
title_full Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification
title_fullStr Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification
title_full_unstemmed Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification
title_short Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification
title_sort deep learning feature extraction approach for hematopoietic cancer subtype classification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7926954/
https://www.ncbi.nlm.nih.gov/pubmed/33672300
http://dx.doi.org/10.3390/ijerph18042197
work_keys_str_mv AT parkkwangho deeplearningfeatureextractionapproachforhematopoieticcancersubtypeclassification
AT batbaatarerdenebileg deeplearningfeatureextractionapproachforhematopoieticcancersubtypeclassification
AT piaoyongjun deeplearningfeatureextractionapproachforhematopoieticcancersubtypeclassification
AT theeraumponnipon deeplearningfeatureextractionapproachforhematopoieticcancersubtypeclassification
AT ryukeunho deeplearningfeatureextractionapproachforhematopoieticcancersubtypeclassification
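
The description field above mentions an autoencoder trained with two losses at once, a reconstruction loss and a classification loss (focal loss in the best-performing configuration). The following is a minimal sketch of how such a combined objective could be computed for a single sample. It is not the authors' implementation: the function names, the use of mean squared error for reconstruction, the focusing parameter gamma=2.0, and the simple weighted sum with weight=1.0 are all assumptions made for illustration, since the record does not specify them.

```python
import math


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    probs  : predicted class probabilities (should sum to 1)
    target : index of the true class
    gamma  : focusing parameter (assumed value; gamma=0 gives cross-entropy)
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def reconstruction_loss(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def combined_loss(x, x_hat, probs, target, gamma=2.0, weight=1.0):
    """Hypothetical joint objective: reconstruction + weight * focal loss."""
    return reconstruction_loss(x, x_hat) + weight * focal_loss(probs, target, gamma)


# Toy sample: a 2-feature input, its reconstruction, and a 2-class prediction.
loss = combined_loss(x=[0.0, 0.0], x_hat=[1.0, 1.0], probs=[0.5, 0.5], target=0)
print(round(loss, 4))
```

Because the focal term scales the log-loss by (1 - p_t)^gamma, confidently correct predictions contribute little, which is one reason this kind of loss is often paired with imbalanced class distributions such as the one the description refers to.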