An Integrated Framework for Data Quality Fusion in Embedded Sensor Systems


Bibliographic Details
Main authors: Scholl, Christoph; Spiegler, Maximilian; Ludwig, Klaus; Eskofier, Bjoern M.; Tobola, Andreas; Zanca, Dario
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10140861/
https://www.ncbi.nlm.nih.gov/pubmed/37112142
http://dx.doi.org/10.3390/s23083798
Description
Summary: The advancement of embedded sensor systems has enabled the monitoring of complex processes through connected devices. As these sensor systems produce more and more data, and as the data are used in increasingly vital application areas, it is of growing importance to also track the data quality of these systems. We propose a framework to fuse sensor data streams and associated data quality attributes into a single meaningful and interpretable value that represents the current underlying data quality. The fusion algorithms are engineered on the basis of defined data quality attributes and of metrics that yield real-valued figures representing the quality of each attribute. Methods based on maximum likelihood estimation (MLE) and fuzzy logic are used to perform data quality fusion by utilizing domain knowledge and sensor measurements. Two data sets are used to verify the proposed fusion framework: first, a proprietary data set targeting sample rate inaccuracies of a micro-electro-mechanical system (MEMS) accelerometer, and second, the publicly available Intel Lab Data set. The algorithms are verified against their expected behavior based on data exploration and correlation analysis. We show that both fusion approaches are capable of detecting data quality issues and providing an interpretable data quality indicator.
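
To illustrate the idea of fusing several data quality figures into one value, the following is a minimal sketch in Python. It is not the authors' algorithm, which this record does not describe. It assumes that each quality attribute yields an independent, Gaussian-distributed score with a known variance. Under that assumption, the maximum likelihood estimate of the common underlying quality is the inverse-variance weighted mean. The function name `mle_fuse` and the example numbers are hypothetical.

```python
from typing import Sequence, Tuple


def mle_fuse(scores: Sequence[float],
             variances: Sequence[float]) -> Tuple[float, float]:
    """Fuse quality scores by inverse-variance weighting.

    Assumption (not from the paper): each score is an independent Gaussian
    estimate of the same underlying quality value with known variance.
    In that case the weighted mean below is the maximum likelihood estimate.
    Returns the fused score and the variance of the fused estimate.
    """
    if len(scores) == 0 or len(scores) != len(variances):
        raise ValueError("scores and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")

    # A small variance means a trustworthy score, so it gets a large weight.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)

    fused = sum(w * s for w, s in zip(weights, scores)) / total_weight
    fused_variance = 1.0 / total_weight
    return fused, fused_variance


if __name__ == "__main__":
    # Hypothetical example: two quality scores in [0, 1] with different trust.
    fused, fused_variance = mle_fuse([0.9, 0.6], [0.01, 0.04])
    print(fused, fused_variance)
```

With the hypothetical inputs above, the more reliable score (variance 0.01) dominates, so the fused value comes out at about 0.84 rather than the plain average of 0.75.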