
A Two-Step Approach to Overcoming Data Imbalance in the Development of an Electrocardiography Data Quality Assessment Algorithm: A Real-World Data Challenge

Continuously acquired biosignals from patient monitors contain significant amounts of unusable data. During the development of a decision support system based on continuously acquired biosignals, we developed machine and deep learning algorithms to automatically classify the quality of ECG data. A t...


Bibliographic Details
Main authors: Kim, Hyun Joo; Venkat, S. Jayakumar; Chang, Hyoung Woo; Cho, Yang Hyun; Lee, Jee Yang; Koo, Kyunghee
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10046279/
https://www.ncbi.nlm.nih.gov/pubmed/36975349
http://dx.doi.org/10.3390/biomimetics8010119
author Kim, Hyun Joo
Venkat, S. Jayakumar
Chang, Hyoung Woo
Cho, Yang Hyun
Lee, Jee Yang
Koo, Kyunghee
collection PubMed
description Continuously acquired biosignals from patient monitors contain significant amounts of unusable data. While developing a decision support system based on such biosignals, we developed machine learning and deep learning algorithms to automatically classify the quality of ECG data. A total of 31,127 twenty-second ECG segments sampled at 250 Hz were used as the training/validation dataset. Data quality was categorized into three classes: acceptable, unacceptable, and uncertain. In the training/validation dataset, 29,606 segments (95%) were in the acceptable class. Two one-step, three-class approaches and two two-step, sequential binary approaches were developed using random forest (RF) and two-dimensional convolutional neural network (2D CNN) classifiers. The four approaches were tested on 9779 samples from another hospital. On the test dataset, the two-step 2D CNN approach showed the best overall accuracy (0.85), and the one-step, three-class 2D CNN approach showed the worst (0.54). The most important metric, precision in the acceptable class, was greater than 0.9 for all approaches, but recall in the acceptable class was better for the two-step approaches: 0.77 (one-step) vs. 0.89 (two-step) for RF and 0.51 (one-step) vs. 0.94 (two-step) for 2D CNN (p < 0.001 for both comparisons). For ECG quality classification, where substantial data imbalance exists, the two-step approaches showed more robust performance than the one-step approaches. This algorithm can be used as a preprocessing step in artificial intelligence research using continuously acquired biosignals.
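The two-step idea and the per-class metrics in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. The description does not say which classes each binary stage separates, so the ordering here (stage 1: acceptable vs. the rest; stage 2: unacceptable vs. uncertain) is an assumption. The amplitude-range thresholds, the toy segments, and the names `two_step_classify` and `precision_recall` are hypothetical stand-ins; the study itself used RF and 2D CNN classifiers.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

Segment = Sequence[float]
BinaryClassifier = Callable[[Segment], bool]


def two_step_classify(segment: Segment,
                      is_acceptable: BinaryClassifier,
                      is_unacceptable: BinaryClassifier) -> str:
    """Chain two binary decisions into one three-class label.

    Assumed ordering (not stated in the description):
    stage 1 splits acceptable from the rest, stage 2 splits
    unacceptable from uncertain.
    """
    if is_acceptable(segment):
        return "acceptable"
    return "unacceptable" if is_unacceptable(segment) else "uncertain"


def precision_recall(y_true: Sequence[str], y_pred: Sequence[str],
                     label: str) -> Tuple[float, float]:
    """Return (precision, recall) for one class."""
    pairs = Counter(zip(y_true, y_pred))
    tp = pairs[(label, label)]
    predicted = sum(n for (_, p), n in pairs.items() if p == label)
    actual = sum(n for (t, _), n in pairs.items() if t == label)
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


def amplitude_range(segment: Segment) -> float:
    """Peak-to-peak amplitude of a segment."""
    return max(segment) - min(segment)


# Hypothetical stand-in classifiers: simple thresholds on amplitude range.
def stage1(segment: Segment) -> bool:
    return amplitude_range(segment) < 2.0


def stage2(segment: Segment) -> bool:
    return amplitude_range(segment) > 5.0


# Toy data, invented purely for illustration.
segments = [[0.0, 1.0], [0.0, 1.5], [0.0, 3.0], [0.0, 8.0], [0.0, 1.8]]
labels = ["acceptable", "acceptable", "uncertain", "unacceptable", "uncertain"]

preds = [two_step_classify(s, stage1, stage2) for s in segments]
precision, recall = precision_recall(labels, preds, "acceptable")

# Class share reported in the description: 29,606 of 31,127 segments.
acceptable_share = 29606 / 31127

print(preds)
print(f"precision={precision:.2f} recall={recall:.2f}")
print(f"acceptable share={acceptable_share:.2%}")
```

Because each stage only has to solve a binary problem, stage 2 never sees the dominant acceptable class, which is one plausible reason a cascade copes better with a 95% majority class than a single three-class model.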
format Online
Article
Text
id pubmed-10046279
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10046279 2023-03-29 Biomimetics (Basel) Article
MDPI 2023-03-13 /pmc/articles/PMC10046279/ /pubmed/36975349 http://dx.doi.org/10.3390/biomimetics8010119 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Two-Step Approach to Overcoming Data Imbalance in the Development of an Electrocardiography Data Quality Assessment Algorithm: A Real-World Data Challenge
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10046279/
https://www.ncbi.nlm.nih.gov/pubmed/36975349
http://dx.doi.org/10.3390/biomimetics8010119