Combination of Feature Selection and Resampling Methods to Predict Preterm Birth Based on Electrohysterographic Signals from Imbalance Data

Due to its high sensitivity, electrohysterography (EHG) has emerged as an alternative technique for predicting preterm labor. The main obstacle in designing preterm labor prediction models is the inherent preterm/term imbalance ratio, which can give rise to relatively low performance. Numerous studi...

Bibliographic Details
Main Authors: Nieto-del-Amor, Félix, Prats-Boluda, Gema, Garcia-Casado, Javier, Diaz-Martinez, Alba, Diago-Almela, Vicente Jose, Monfort-Ortiz, Rogelio, Hao, Dongmei, Ye-Lin, Yiyao
Format: Online Article Text
Language: English
Published: MDPI 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9319575/
https://www.ncbi.nlm.nih.gov/pubmed/35890778
http://dx.doi.org/10.3390/s22145098
_version_ 1784755582887002112
author Nieto-del-Amor, Félix
Prats-Boluda, Gema
Garcia-Casado, Javier
Diaz-Martinez, Alba
Diago-Almela, Vicente Jose
Monfort-Ortiz, Rogelio
Hao, Dongmei
Ye-Lin, Yiyao
author_sort Nieto-del-Amor, Félix
collection PubMed
description Due to its high sensitivity, electrohysterography (EHG) has emerged as an alternative technique for predicting preterm labor. The main obstacle in designing preterm labor prediction models is the inherent preterm/term imbalance ratio, which can give rise to relatively low performance. Numerous studies obtained promising preterm labor prediction results using the synthetic minority oversampling technique. However, these studies generally overestimate mathematical models’ real generalization capacity by generating synthetic data before splitting the dataset, leaking information between the training and testing partitions and thus reducing the complexity of the classification task. In this work, we analyzed the effect of combining feature selection and resampling methods to overcome the class imbalance problem for predicting preterm labor by EHG. We assessed undersampling, oversampling, and hybrid methods applied to the training and validation dataset during feature selection by genetic algorithm, and analyzed the resampling effect on training data after obtaining the optimized feature subset. The best strategy consisted of undersampling the majority class of the validation dataset to 1:1 during feature selection, without subsequent resampling of the training data, achieving an AUC of 94.5 ± 4.6%, average precision of 84.5 ± 11.7%, maximum F1-score of 79.6 ± 13.8%, and recall of 89.8 ± 12.1%. Our results outperformed the techniques currently used in clinical practice, suggesting the EHG could be used to predict preterm labor in clinics.
format Online
Article
Text
id pubmed-9319575
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9319575 2022-07-27 Sensors (Basel) Article MDPI 2022-07-07 /pmc/articles/PMC9319575/ /pubmed/35890778 http://dx.doi.org/10.3390/s22145098 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Combination of Feature Selection and Resampling Methods to Predict Preterm Birth Based on Electrohysterographic Signals from Imbalance Data
topic Article
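
The abstract in this record warns that generating or removing samples before splitting the dataset leaks information between partitions, and reports that the best strategy undersampled the majority class of the validation data to 1:1. The ordering can be illustrated with a minimal sketch, which is not the authors' implementation: it uses plain random undersampling, omits the genetic-algorithm feature selection entirely, and works on made-up labels. The function names `stratified_split` and `undersample_to_balance`, the 30/270 class counts, and the 0.3 validation fraction are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in each part."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def undersample_to_balance(indices, labels, seed=0):
    """Randomly drop indices until every class matches the minority count."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    n_min = min(len(v) for v in by_class.values())
    kept = []
    for cls in sorted(by_class):
        kept.extend(rng.sample(by_class[cls], n_min))
    return sorted(kept)


# Hypothetical labels: 1 = preterm (minority), 0 = term (majority).
labels = [1] * 30 + [0] * 270

# Step 1: split first, so no record can appear in both partitions.
train_idx, val_idx = stratified_split(labels, test_fraction=0.3, seed=42)

# Step 2: resample only inside one partition (here, validation to 1:1).
val_balanced = undersample_to_balance(val_idx, labels, seed=42)

print("train:", Counter(labels[i] for i in train_idx))
print("validation (balanced):", Counter(labels[i] for i in val_balanced))
```

Because the split happens before any resampling, the training indices are left at their original imbalance and share no records with the validation indices, which is the property the abstract says earlier studies lost.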