Tomek Link and SMOTE Approaches for Machine Fault Classification with an Imbalanced Dataset
Data-driven methods have prominently featured in the progressive research and development of modern condition monitoring systems for electrical machines. These methods have the advantage of simplicity when it comes to the implementation of effective fault detection and diagnostic systems. Despite th...
Main authors: | Swana, Elsie Fezeka; Doorsamy, Wesley; Bokoro, Pitshou |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9099503/ https://www.ncbi.nlm.nih.gov/pubmed/35590937 http://dx.doi.org/10.3390/s22093246 |
_version_ | 1784706621865197568 |
---|---|
author | Swana, Elsie Fezeka Doorsamy, Wesley Bokoro, Pitshou |
collection | PubMed |
description | Data-driven methods have prominently featured in the progressive research and development of modern condition monitoring systems for electrical machines. These methods have the advantage of simplicity when it comes to the implementation of effective fault detection and diagnostic systems. Despite their many advantages, the practical implementation of data-driven approaches still faces challenges such as data imbalance. The lack of sufficient and reliable labeled fault data from machines in the field often poses a challenge in developing accurate supervised learning-based condition monitoring systems. This research investigates the use of a Naïve Bayes classifier, a support vector machine, and k-nearest neighbors together with the synthetic minority oversampling technique (SMOTE), Tomek links, and the combination of these two resampling techniques for fault classification with simulation and experimental imbalanced data. A comparative analysis of these techniques is conducted for different imbalanced data cases to determine their suitability for condition monitoring on a wound-rotor induction generator. The precision, recall, and F1-score metrics are applied for performance evaluation. The results indicate that the technique combining the synthetic minority oversampling technique with the Tomek link provides the best performance across all tested classifiers. The k-nearest neighbors classifier, together with this combined resampling technique, yielded the most accurate classification results. This research is of interest to researchers and practitioners working in the area of condition monitoring in electrical machines, and the findings and presented approach of the comparative analysis will assist with the selection of the most suitable technique for handling imbalanced fault data. This is especially important in the practice of condition monitoring on electrical rotating machines, where fault data are very limited. |
format | Online Article Text |
id | pubmed-9099503 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9099503 2022-05-14 Tomek Link and SMOTE Approaches for Machine Fault Classification with an Imbalanced Dataset Swana, Elsie Fezeka Doorsamy, Wesley Bokoro, Pitshou Sensors (Basel) Article MDPI 2022-04-23 /pmc/articles/PMC9099503/ /pubmed/35590937 http://dx.doi.org/10.3390/s22093246 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Tomek Link and SMOTE Approaches for Machine Fault Classification with an Imbalanced Dataset |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9099503/ https://www.ncbi.nlm.nih.gov/pubmed/35590937 http://dx.doi.org/10.3390/s22093246 |
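
The description above names two resampling techniques, SMOTE and Tomek links, and their combination. The following is a minimal sketch of how these ideas are commonly implemented, written in plain Python for illustration. It is not the authors' code: the function names (`nearest_neighbors`, `tomek_links`, `smote`, `smote_then_tomek`), the Euclidean distance, the default `k=3`, the two-class assumption, and the choice to drop both members of each Tomek link are all assumptions made here.

```python
import math
import random


def nearest_neighbors(points, idx, k, candidates=None):
    """Return indices of the k points closest to points[idx] (Euclidean)."""
    pool = candidates if candidates is not None else range(len(points))
    others = [j for j in pool if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def tomek_links(X, y):
    """Find pairs (i, j) of opposite-class samples that are each other's nearest neighbor."""
    nn = [nearest_neighbors(X, i, 1)[0] for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def smote(X, y, minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbors.
    Assumes at least two minority samples exist."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(minority_idx)
        j = rng.choice(nearest_neighbors(X, i, k, minority_idx))
        gap = rng.random()  # position along the segment from X[i] to X[j]
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
    return synthetic


def smote_then_tomek(X, y, minority, k=3, seed=0):
    """Oversample the minority class to parity, then drop both members of
    every Tomek link (an assumed cleaning rule; assumes two classes)."""
    n_min = sum(1 for label in y if label == minority)
    n_new = max(0, (len(y) - n_min) - n_min)
    new_points = smote(X, y, minority, n_new, k, seed)
    X2 = list(X) + new_points
    y2 = list(y) + [minority] * len(new_points)
    drop = {i for pair in tomek_links(X2, y2) for i in pair}
    keep = [i for i in range(len(X2)) if i not in drop]
    return [X2[i] for i in keep], [y2[i] for i in keep]


# Toy 1-D example: samples 1 and 2 are mutual nearest neighbors with different labels.
links = tomek_links([(0.0,), (1.0,), (1.1,), (3.0,)], [0, 0, 1, 1])
print(links)  # → [(1, 2)]
```

Removing only the majority-class member of each link is another common variant; which rule the study used is not stated in this record. The resampled data would then be passed to a classifier such as k-nearest neighbors and scored with precision, recall, and F1-score.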