A filter approach for feature selection in classification: application to automatic atrial fibrillation detection in electrocardiogram recordings

BACKGROUND: In high-dimensional data analysis, the complexity of predictive models can be reduced by selecting the most relevant features, which is crucial to reduce data noise and increase model accuracy and interpretability. Thus, in the field of clinical decision making, only the most relevant fe...

Bibliographic Details
Main authors: Michel, Pierre; Ngo, Nicolas; Pons, Jean-François; Delliaux, Stéphane; Giorgi, Roch
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8094578/
https://www.ncbi.nlm.nih.gov/pubmed/33947379
http://dx.doi.org/10.1186/s12911-021-01427-8
author Michel, Pierre
Ngo, Nicolas
Pons, Jean-François
Delliaux, Stéphane
Giorgi, Roch
collection PubMed
description BACKGROUND: In high-dimensional data analysis, the complexity of predictive models can be reduced by selecting the most relevant features, which is crucial to reduce data noise and increase model accuracy and interpretability. Thus, in the field of clinical decision making, only the most relevant features from a set of medical descriptors should be considered when determining whether a patient is healthy or not. This statistical approach, known as feature selection, can be performed through regression or classification, in a supervised or unsupervised manner. Several feature selection approaches using different mathematical concepts have been described in the literature. In the field of classification, a new approach has recently been proposed that uses the [Formula: see text]-metric, an index measuring separability between different classes in heart rhythm characterization. The present study proposes a filter approach for feature selection in classification using this [Formula: see text]-metric, and evaluates its application to automatic atrial fibrillation detection.
METHODS: The stability and prediction performance of the [Formula: see text]-metric feature selection approach were evaluated using the support vector machine model on two heart rhythm datasets, one extracted from the PhysioNet database and the other from the database of Marseille University Hospital Center, France (Timone Hospital). Both datasets contained electrocardiogram recordings grouped into two classes: normal sinus rhythm and atrial fibrillation. The performance of this feature selection approach was compared to that of three other approaches, two based on the Random Forest technique and the third on receiver operating characteristic curve analysis.
RESULTS: The [Formula: see text]-metric approach showed satisfactory results, especially for models with a smaller number of features. For the training dataset, all prediction indicators were higher for our approach (accuracy greater than 99% for models with 5 to 17 features), as was stability (greater than 0.925 regardless of the number of features included in the model). For the validation dataset, the features selected with the [Formula: see text]-metric approach differed from those selected with the other approaches; sensitivity was higher for our approach, but other indicators were similar.
CONCLUSION: This filter approach for feature selection in classification opens up new methodological avenues for atrial fibrillation detection using short electrocardiogram recordings.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-021-01427-8.
format Online
Article
Text
id pubmed-8094578
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8094578 2021-05-05 BMC Med Inform Decis Mak Research
BioMed Central 2021-05-04 /pmc/articles/PMC8094578/ /pubmed/33947379 http://dx.doi.org/10.1186/s12911-021-01427-8 Text en
© The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A filter approach for feature selection in classification: application to automatic atrial fibrillation detection in electrocardiogram recordings
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8094578/
https://www.ncbi.nlm.nih.gov/pubmed/33947379
http://dx.doi.org/10.1186/s12911-021-01427-8