A Two-Step Approach to Overcoming Data Imbalance in the Development of an Electrocardiography Data Quality Assessment Algorithm: A Real-World Data Challenge
Continuously acquired biosignals from patient monitors contain significant amounts of unusable data. During the development of a decision support system based on continuously acquired biosignals, we developed machine and deep learning algorithms to automatically classify the quality of ECG data. A t...
Main authors: | Kim, Hyun Joo; Venkat, S. Jayakumar; Chang, Hyoung Woo; Cho, Yang Hyun; Lee, Jee Yang; Koo, Kyunghee |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10046279/ https://www.ncbi.nlm.nih.gov/pubmed/36975349 http://dx.doi.org/10.3390/biomimetics8010119 |
_version_ | 1785013633066991616 |
---|---|
author | Kim, Hyun Joo Venkat, S. Jayakumar Chang, Hyoung Woo Cho, Yang Hyun Lee, Jee Yang Koo, Kyunghee |
author_sort | Kim, Hyun Joo |
collection | PubMed |
description | Continuously acquired biosignals from patient monitors contain significant amounts of unusable data. During the development of a decision support system based on continuously acquired biosignals, we developed machine and deep learning algorithms to automatically classify the quality of ECG data. A total of 31,127 20-second ECG segments sampled at 250 Hz were used as the training/validation dataset. Data quality was categorized into three classes: acceptable, unacceptable, and uncertain. In the training/validation dataset, 29,606 segments (95%) were in the acceptable class. Two one-step, three-class approaches and two two-step binary sequential approaches were developed using random forest (RF) and two-dimensional convolutional neural network (2D CNN) classifiers. The four approaches were tested on 9779 test samples from another hospital. On the test dataset, the two-step 2D CNN approach showed the best overall accuracy (0.85), and the one-step, three-class 2D CNN approach showed the worst overall accuracy (0.54). The most important parameter, precision in the acceptable class, was greater than 0.9 for all approaches, but recall in the acceptable class was better for the two-step approaches: 0.77 (one-step) vs. 0.89 (two-step) for RF, and 0.51 (one-step) vs. 0.94 (two-step) for 2D CNN (p < 0.001 for both comparisons). For ECG quality classification, where substantial data imbalance exists, the two-step approaches showed more robust performance than the one-step approaches. This algorithm can be used as a preprocessing step in artificial intelligence research using continuously acquired biosignals. |
format | Online Article Text |
id | pubmed-10046279 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10046279 2023-03-29 Biomimetics (Basel) Article MDPI 2023-03-13 /pmc/articles/PMC10046279/ /pubmed/36975349 http://dx.doi.org/10.3390/biomimetics8010119 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | A Two-Step Approach to Overcoming Data Imbalance in the Development of an Electrocardiography Data Quality Assessment Algorithm: A Real-World Data Challenge |
title_sort | two-step approach to overcoming data imbalance in the development of an electrocardiography data quality assessment algorithm: a real-world data challenge |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10046279/ https://www.ncbi.nlm.nih.gov/pubmed/36975349 http://dx.doi.org/10.3390/biomimetics8010119 |
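
The two-step binary sequential approach mentioned in the description can be illustrated with a minimal Python sketch. The record does not state how the two binary steps are split, so the ordering here (first "acceptable vs. not acceptable", then "unacceptable vs. uncertain") is an assumption, not the authors' documented design. The names `two_step_predict`, `precision_recall`, `toy_step1`, and `toy_step2`, and the made-up "noise score" inputs, are all hypothetical; the toy threshold functions merely stand in for trained RF or 2D CNN classifiers.

```python
from typing import Callable, Sequence, Tuple

ACCEPTABLE = "acceptable"
UNACCEPTABLE = "unacceptable"
UNCERTAIN = "uncertain"


def two_step_predict(
    segment: float,
    is_acceptable: Callable[[float], bool],
    is_unacceptable: Callable[[float], bool],
) -> str:
    """Cascade two binary decisions into one three-class label.

    Assumed ordering (not confirmed by the record): step 1 separates the
    majority 'acceptable' class; step 2 splits the remainder.
    """
    if is_acceptable(segment):
        return ACCEPTABLE
    return UNACCEPTABLE if is_unacceptable(segment) else UNCERTAIN


def precision_recall(
    y_true: Sequence[str], y_pred: Sequence[str], label: str
) -> Tuple[float, float]:
    """Per-class precision and recall for one label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)
    actual = sum(t == label for t in y_true)
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


# Hypothetical stand-ins for trained classifiers, acting on a made-up
# noise score in [0, 1] instead of a real ECG segment.
def toy_step1(noise: float) -> bool:
    return noise < 0.3


def toy_step2(noise: float) -> bool:
    return noise > 0.7


# Fabricated toy inputs for illustration only; these are not study data.
scores = [0.1, 0.2, 0.5, 0.9, 0.25, 0.8]
truth = [ACCEPTABLE, ACCEPTABLE, UNCERTAIN, UNACCEPTABLE, UNCERTAIN, UNACCEPTABLE]

preds = [two_step_predict(s, toy_step1, toy_step2) for s in scores]
precision, recall = precision_recall(truth, preds, ACCEPTABLE)
print(preds)
print(f"acceptable: precision={precision:.2f}, recall={recall:.2f}")
```

Reporting precision and recall per class, as the description does for the acceptable class, is useful under heavy imbalance because overall accuracy alone can be dominated by the majority class.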