Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN

BACKGROUND: Imbalanced data classification is an unavoidable problem in intelligent medical diagnosis. Most real-world biomedical datasets have limited samples and high-dimensional features. This seriously degrades the classification performance of the model and causes erroneous gu...

Bibliographic Details
Main authors: Zhang, Liyuan; Yang, Huamin; Jiang, Zhengang
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6280414/
https://www.ncbi.nlm.nih.gov/pubmed/30514298
http://dx.doi.org/10.1186/s12938-018-0604-3
_version_ 1783378666682580992
author Zhang, Liyuan
Yang, Huamin
Jiang, Zhengang
collection PubMed
description BACKGROUND: Imbalanced data classification is an unavoidable problem in intelligent medical diagnosis. Most real-world biomedical datasets have limited samples and high-dimensional features. This seriously degrades the classification performance of the model and can give erroneous guidance for disease diagnosis. Developing an effective classification method for imbalanced and limited biomedical datasets is a challenging task. METHODS: In this paper, we propose a novel multilayer extreme learning machine (ELM) classification model combined with a dynamic generative adversarial net (GAN) to tackle limited and imbalanced biomedical data. First, principal component analysis is used to remove irrelevant and redundant features, so that more meaningful pathological features are extracted. Next, a dynamic GAN is designed to generate realistic-looking minority class samples, thereby balancing the class distribution and effectively avoiding overfitting. Finally, a self-adaptive multilayer ELM is proposed to classify the balanced dataset. An analytic expression for the number of hidden layers and nodes is determined by quantitatively establishing the relationship between the change in imbalance ratio and the hyper-parameters of the model. Reducing interactive parameter adjustment makes the classification model more robust. RESULTS: To evaluate the classification performance of the proposed method, numerical experiments are conducted on four real-world biomedical datasets. The proposed method can generate authentic minority class samples and self-adaptively select the optimal parameters of the learning model. Compared with the W-ELM, SMOTE-ELM, and H-ELM methods, the quantitative experimental results demonstrate that our method achieves better classification performance in terms of ROC, AUC, G-mean, and F-measure metrics, as well as higher computational efficiency.
CONCLUSIONS: Our study provides an effective solution for imbalanced biomedical data classification under conditions of limited samples and high-dimensional features. The proposed method could offer a theoretical basis for computer-aided diagnosis. It has the potential to be applied in biomedical clinical practice.
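The abstract names a pipeline of principal component analysis, an ELM classifier, and the G-mean and F-measure metrics. As a rough illustration of those building blocks only, the following is a minimal NumPy sketch under stated assumptions: it uses a basic single-hidden-layer ELM with a ridge-regularized least-squares output layer, not the authors' self-adaptive multilayer ELM, and it omits the dynamic GAN oversampling step entirely. The names pca_fit_transform, BasicELM, and g_mean_and_f_measure, the hyper-parameter values, and the synthetic data are all hypothetical and do not come from the record.

```python
import numpy as np


def pca_fit_transform(X, n_components):
    """Project X onto its top principal components (SVD of the centered data)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, mean, components


class BasicELM:
    """Single-hidden-layer ELM (assumption: NOT the paper's multilayer variant).

    Hidden weights are random and fixed; only the output weights are solved
    in closed form by ridge-regularized least squares.
    """

    def __init__(self, n_hidden=40, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        t = 2.0 * np.asarray(y) - 1.0  # map labels {0, 1} to targets {-1, +1}
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ t)
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta > 0).astype(int)


def g_mean_and_f_measure(y_true, y_pred):
    """G-mean and F-measure, treating label 1 as the minority (positive) class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    g_mean = float(np.sqrt(sensitivity * specificity))
    denom = precision + sensitivity
    f_measure = 2.0 * precision * sensitivity / denom if denom else 0.0
    return g_mean, f_measure


# Synthetic imbalanced data (10:1), purely illustrative.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=-1.0, size=(200, 20)),
               rng.normal(loc=1.0, size=(20, 20))])
y = np.array([0] * 200 + [1] * 20)

Z, mean, components = pca_fit_transform(X, n_components=5)
model = BasicELM(n_hidden=40).fit(Z, y)
# Scores are computed on the training data here, so they are optimistic.
g, f = g_mean_and_f_measure(y, model.predict(Z))
print(f"G-mean={g:.3f}, F-measure={f:.3f}")
```

G-mean is the geometric mean of sensitivity and specificity, so it collapses to zero when a classifier ignores the minority class, which is why it is often reported alongside F-measure for imbalanced problems.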
format Online
Article
Text
id pubmed-6280414
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6280414 2018-12-10 Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN Zhang, Liyuan Yang, Huamin Jiang, Zhengang Biomed Eng Online Research BioMed Central 2018-12-04 /pmc/articles/PMC6280414/ /pubmed/30514298 http://dx.doi.org/10.1186/s12938-018-0604-3 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN
topic Research