A Hybrid Missing Data Imputation Method for Batch Process Monitoring Dataset
Batch process monitoring datasets usually contain missing data, which decreases the performance of data-driven modeling for fault identification and optimal control. Many methods have been proposed to impute missing data; however, they do not fulfill the need for data quality, especially in sensor d...
Main Authors: | Gan, Qihong; Gong, Lang; Hu, Dasha; Jiang, Yuming; Ding, Xuefeng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10650138/ https://www.ncbi.nlm.nih.gov/pubmed/37960379 http://dx.doi.org/10.3390/s23218678 |
_version_ | 1785135711381356544 |
---|---|
author | Gan, Qihong Gong, Lang Hu, Dasha Jiang, Yuming Ding, Xuefeng |
author_facet | Gan, Qihong Gong, Lang Hu, Dasha Jiang, Yuming Ding, Xuefeng |
author_sort | Gan, Qihong |
collection | PubMed |
description | Batch process monitoring datasets usually contain missing data, which decreases the performance of data-driven modeling for fault identification and optimal control. Many methods have been proposed to impute missing data; however, they do not fulfill the need for data quality, especially in sensor datasets with different types of missing data. We propose a hybrid missing data imputation method for batch process monitoring datasets with multi-type missing data. In this method, the missing data is first classified into five categories based on the continuous missing duration and the number of variables missing simultaneously. Then, different categories of missing data are step-by-step imputed considering their unique characteristics. A combination of three single-dimensional interpolation models is employed to impute transient isolated missing values. An iterative imputation based on a multivariate regression model is designed for imputing long-term missing variables, and a combination model based on single-dimensional interpolation and multivariate regression is proposed for imputing short-term missing variables. The Long Short-Term Memory (LSTM) model is utilized to impute both short-term and long-term missing samples. Finally, a series of experiments for different categories of missing data were conducted based on a real-world batch process monitoring dataset. The results demonstrate that the proposed method achieves higher imputation accuracy than other comparative methods. |
format | Online Article Text |
id | pubmed-10650138 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10650138 2023-10-24 A Hybrid Missing Data Imputation Method for Batch Process Monitoring Dataset Gan, Qihong Gong, Lang Hu, Dasha Jiang, Yuming Ding, Xuefeng Sensors (Basel) Article MDPI 2023-10-24 /pmc/articles/PMC10650138/ /pubmed/37960379 http://dx.doi.org/10.3390/s23218678 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | A Hybrid Missing Data Imputation Method for Batch Process Monitoring Dataset |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10650138/ https://www.ncbi.nlm.nih.gov/pubmed/37960379 http://dx.doi.org/10.3390/s23218678 |
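
The description field above says that missing data is first sorted into five categories according to how long a gap lasts and how many variables are missing at the same time. The following is a minimal sketch of that classification step only, not the authors' implementation. The thresholds `transient_max` and `short_max`, the function names, and the reading of a "missing sample" as a time step where every variable is missing are all assumptions made for illustration; the record does not state them.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Row = List[Optional[float]]  # one time step; None marks a missing reading


def runs(flags: List[bool]) -> Iterator[Tuple[int, int]]:
    """Yield (start, length) for each maximal run of True values."""
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            yield start, i - start
            start = None
    if start is not None:
        yield start, len(flags) - start


def classify_missing(
    rows: List[Row],
    transient_max: int = 1,  # assumed threshold, not from the source
    short_max: int = 5,      # assumed threshold, not from the source
) -> Dict[Tuple[int, int], str]:
    """Map each missing cell (time index, variable index) to a category."""
    if not rows:
        return {}
    n_vars = len(rows[0])
    labels: Dict[Tuple[int, int], str] = {}

    # Assumption: a "missing sample" is a time step with all variables missing.
    full = [all(v is None for v in row) for row in rows]
    for start, length in runs(full):
        cat = ("short-term missing sample" if length <= short_max
               else "long-term missing sample")
        for t in range(start, start + length):
            for j in range(n_vars):
                labels[(t, j)] = cat

    # Remaining gaps are classified per variable by their duration.
    for j in range(n_vars):
        partial = [row[j] is None and not f for row, f in zip(rows, full)]
        for start, length in runs(partial):
            if length <= transient_max:
                cat = "transient isolated missing value"
            elif length <= short_max:
                cat = "short-term missing variable"
            else:
                cat = "long-term missing variable"
            for t in range(start, start + length):
                labels[(t, j)] = cat
    return labels
```

Each label could then be used to route a gap to the matching imputer named in the description (interpolation, multivariate regression, or LSTM); those models are outside the scope of this sketch.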