
Pruning-based oversampling technique with smoothed bootstrap resampling for imbalanced clinical dataset of Covid-19

The Coronavirus Disease (COVID-19) was declared a pandemic disease by the World Health Organization (WHO), and it has not ended so far. Since the infection rate of the COVID-19 increases, the computational approach is needed to predict patients infected with COVID-19 in order to speed up the diagnos...


Bibliographic Details
Main Authors: Wibowo, Prasetyo; Fatichah, Chastine
Format: Online Article Text
Language: English
Published: The Authors. Published by Elsevier B.V. on behalf of King Saud University. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8482553/
http://dx.doi.org/10.1016/j.jksuci.2021.09.021
author Wibowo, Prasetyo
Fatichah, Chastine
collection PubMed
description Coronavirus disease (COVID-19) was declared a pandemic by the World Health Organization (WHO), and it has not yet ended. As the COVID-19 infection rate increases, a computational approach is needed to predict which patients are infected, in order to shorten diagnosis time and reduce human error relative to conventional diagnosis. However, negative samples outnumber positive samples, and the resulting class imbalance degrades classification performance and biases model evaluation. This study proposes a new oversampling technique, TRIM-SBR, to generate minority-class data for diagnosing patients infected with COVID-19. Developing an oversampling technique remains challenging because of the data's generalization issue. The proposed method uses pruning to find specific minority regions while retaining data generalization; the resulting minority data seeds serve as benchmarks for creating new synthesized data with bootstrap resampling techniques. Accuracy, specificity, sensitivity, F-measure, and AUC are used to evaluate classifier performance under data imbalance. The results show that TRIM-SBR provides the best performance among the oversampling techniques compared.
format Online
Article
Text
id pubmed-8482553
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher The Authors. Published by Elsevier B.V. on behalf of King Saud University.
record_format MEDLINE/PubMed
spelling pubmed-8482553 2021-09-30 Pruning-based oversampling technique with smoothed bootstrap resampling for imbalanced clinical dataset of Covid-19 Wibowo, Prasetyo Fatichah, Chastine Journal of King Saud University - Computer and Information Sciences Article The Authors. Published by Elsevier B.V. on behalf of King Saud University. 2022-10 2021-09-30 /pmc/articles/PMC8482553/ http://dx.doi.org/10.1016/j.jksuci.2021.09.021 Text en © 2021 The Authors Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Pruning-based oversampling technique with smoothed bootstrap resampling for imbalanced clinical dataset of Covid-19
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8482553/
http://dx.doi.org/10.1016/j.jksuci.2021.09.021