Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN
BACKGROUND: Imbalanced data classification is an inevitable problem in intelligent medical diagnosis. Most real-world biomedical datasets have limited samples and high-dimensional features. This seriously degrades the classification performance of the model and causes erroneous gu...
Main authors: | Zhang, Liyuan; Yang, Huamin; Jiang, Zhengang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6280414/ https://www.ncbi.nlm.nih.gov/pubmed/30514298 http://dx.doi.org/10.1186/s12938-018-0604-3 |
_version_ | 1783378666682580992 |
---|---|
author | Zhang, Liyuan; Yang, Huamin; Jiang, Zhengang |
author_facet | Zhang, Liyuan; Yang, Huamin; Jiang, Zhengang |
author_sort | Zhang, Liyuan |
collection | PubMed |
description | BACKGROUND: Imbalanced data classification is an inevitable problem in intelligent medical diagnosis. Most real-world biomedical datasets have limited samples and high-dimensional features. This seriously degrades the classification performance of the model and causes erroneous guidance for the diagnosis of diseases. Exploring an effective classification method for imbalanced and limited biomedical datasets is a challenging task. METHODS: In this paper, we propose a novel multilayer extreme learning machine (ELM) classification model combined with a dynamic generative adversarial net (GAN) to tackle limited and imbalanced biomedical data. First, principal component analysis is used to remove irrelevant and redundant features. At the same time, more meaningful pathological features are extracted. Next, a dynamic GAN is designed to generate realistic-looking minority class samples, thereby balancing the class distribution and effectively avoiding overfitting. Finally, a self-adaptive multilayer ELM is proposed to classify the balanced dataset. The analytic expression for the number of hidden layers and nodes is determined by quantitatively establishing the relationship between the change in imbalance ratio and the hyper-parameters of the model. Reducing interactive parameter adjustment makes the classification model more robust. RESULTS: To evaluate the classification performance of the proposed method, numerical experiments are conducted on four real-world biomedical datasets. The proposed method can generate authentic minority class samples and self-adaptively select the optimal parameters of the learning model. Compared with the W-ELM, SMOTE-ELM, and H-ELM methods, the quantitative experimental results demonstrate that our method achieves better classification performance and higher computational efficiency in terms of ROC, AUC, G-mean, and F-measure metrics. CONCLUSIONS: Our study provides an effective solution for imbalanced biomedical data classification under the conditions of limited samples and high-dimensional features. The proposed method could offer a theoretical basis for computer-aided diagnosis. It has the potential to be applied in biomedical clinical practice. |
format | Online Article Text |
id | pubmed-6280414 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6280414 2018-12-10 Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN Zhang, Liyuan Yang, Huamin Jiang, Zhengang Biomed Eng Online Research BioMed Central 2018-12-04 /pmc/articles/PMC6280414/ /pubmed/30514298 http://dx.doi.org/10.1186/s12938-018-0604-3 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Zhang, Liyuan Yang, Huamin Jiang, Zhengang Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN |
title | Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6280414/ https://www.ncbi.nlm.nih.gov/pubmed/30514298 http://dx.doi.org/10.1186/s12938-018-0604-3 |
work_keys_str_mv | AT zhangliyuan imbalancedbiomedicaldataclassificationusingselfadaptivemultilayerelmcombinedwithdynamicgan AT yanghuamin imbalancedbiomedicaldataclassificationusingselfadaptivemultilayerelmcombinedwithdynamicgan AT jiangzhengang imbalancedbiomedicaldataclassificationusingselfadaptivemultilayerelmcombinedwithdynamicgan |
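
The abstract in this record reports results in terms of G-mean and F-measure, two metrics commonly used for imbalanced classification. The following is a minimal sketch of how these metrics are usually computed from binary labels. It is not taken from the article; the function names, the `positive` label convention (minority class = 1), and the sample labels are assumptions made for illustration only.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f_measure(y_true, y_pred, positive=1, beta=1.0):
    """F-measure: weighted harmonic mean of precision and recall."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels: 2 minority (1) and 8 majority (0) samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(round(g_mean(y_true, y_pred), 4))
print(round(f_measure(y_true, y_pred), 4))
```

In this made-up example, plain accuracy would be 0.8 even though only one of the two minority samples is detected, which is why metrics that weight the minority class explicitly are preferred for imbalanced data.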