A novel combined dynamic ensemble selection model for imbalanced data to detect COVID-19 from complete blood count

Bibliographic Details
Main authors: Wu, Jiachao; Shen, Jiang; Xu, Man; Shao, Minglai
Format: Online Article Text
Language: English
Published: Elsevier B.V. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479386/
https://www.ncbi.nlm.nih.gov/pubmed/34614451
http://dx.doi.org/10.1016/j.cmpb.2021.106444
Description
Summary: BACKGROUND: As blood testing is radiation-free, low-cost and simple to operate, some researchers use machine learning to detect COVID-19 from blood test data. However, few studies take into consideration the imbalanced data distribution, which can impair the performance of a classifier. METHOD: A novel combined dynamic ensemble selection (DES) method is proposed for imbalanced data to detect COVID-19 from complete blood count. This method combines data preprocessing and improved DES. Firstly, we use the hybrid synthetic minority over-sampling technique and edited nearest neighbor (SMOTE-ENN) to balance the data and remove noise. Secondly, in order to improve the performance of DES, a novel hybrid multiple clustering and bagging classifier generation (HMCBCG) method is proposed to reinforce the diversity and local regional competence of candidate classifiers. RESULTS: The experimental results based on three popular DES methods show that HMCBCG performs better than bagging alone. HMCBCG+KNE obtains the best performance for COVID-19 screening with 99.81% accuracy, 99.86% F1, 99.78% G-mean and 99.81% AUC. CONCLUSION: Compared to other advanced methods, our combined DES model can improve the accuracy, G-mean, F1 and AUC of COVID-19 screening.
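
The summary reports accuracy, F1 and G-mean side by side. As a minimal sketch of why the extra metrics matter on imbalanced data, the snippet below computes them from binary labels using the standard definitions (G-mean as the geometric mean of sensitivity and specificity). The function name `imbalance_metrics`, the choice of `1` as the positive class, and the toy labels are illustrative assumptions; this is not the authors' evaluation code, and the paper may define the metrics slightly differently.

```python
import math


def imbalance_metrics(y_true, y_pred, positive=1):
    """Accuracy, F1 and G-mean for binary labels (hypothetical helper).

    Assumes `positive` marks the class of interest; all other labels
    are treated as negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    g_mean = math.sqrt(sensitivity * specificity)

    return {"accuracy": accuracy, "f1": f1, "g_mean": g_mean}


if __name__ == "__main__":
    # Toy data (made up): 8 negatives, 2 positives.
    y_true = [0] * 8 + [1] * 2
    # A classifier that always predicts the majority class.
    always_negative = [0] * 10
    print(imbalance_metrics(y_true, always_negative))
```

On this toy set the always-negative classifier still scores 0.8 accuracy while its F1 and G-mean are both 0, which is the kind of failure that accuracy alone hides when classes are imbalanced.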