LCSS-Based Algorithm for Computing Multivariate Data Set Similarity: A Case Study of Real-Time WSN Data

Multivariate data sets are common in various application areas, such as wireless sensor networks (WSNs) and DNA analysis. A robust mechanism is required to compute their similarity indexes regardless of the environment and problem domain. This study describes the usefulness of a non-metric-based app...

Bibliographic Details
Main authors: Khan, Rahim; Ali, Ihsan; Altowaijri, Saleh M.; Zakarya, Muhammad; Ur Rahman, Atiq; Ahmedy, Ismail; Khan, Anwar; Gani, Abdullah
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6339076/
https://www.ncbi.nlm.nih.gov/pubmed/30621241
http://dx.doi.org/10.3390/s19010166
_version_ 1783388554477436928
author Khan, Rahim
Ali, Ihsan
Altowaijri, Saleh M.
Zakarya, Muhammad
Ur Rahman, Atiq
Ahmedy, Ismail
Khan, Anwar
Gani, Abdullah
author_sort Khan, Rahim
collection PubMed
description Multivariate data sets are common in various application areas, such as wireless sensor networks (WSNs) and DNA analysis. A robust mechanism is required to compute their similarity indexes regardless of the environment and problem domain. This study describes the usefulness of a non-metric-based approach (i.e., the longest common subsequence) in computing similarity indexes. Several non-metric-based algorithms are available in the literature; the most robust and reliable of them is the dynamic programming-based technique. However, dynamic programming-based techniques are considered inefficient, particularly in the context of multivariate data sets. Furthermore, the classical approaches are not powerful enough in scenarios with multivariate data sets or sensor data, or when the similarity indexes are extremely high or low. To address this issue, we propose an efficient algorithm to measure the similarity indexes of multivariate data sets using a non-metric-based methodology. The proposed algorithm performs exceptionally well on numerous multivariate data sets compared with the classical dynamic programming-based algorithms. The performance of the algorithms is evaluated on several benchmark data sets and on a dynamic multivariate data set obtained from a WSN deployed in the Ghulam Ishaq Khan (GIK) Institute of Engineering Sciences and Technology. Our evaluation suggests that the proposed algorithm can be approximately 39.9% more efficient than its counterparts for various data sets in terms of computational time.
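For orientation, the classical dynamic programming baseline that the description refers to can be sketched as follows. This is a minimal illustration only, not the algorithm proposed in the article. The function names, the tolerance parameter `eps`, the component-wise matching rule, and the normalisation by the shorter sequence are all assumptions made for this example.

```python
from typing import Sequence

Reading = Sequence[float]  # one multivariate sample, e.g. (temperature, humidity)


def readings_match(x: Reading, y: Reading, eps: float) -> bool:
    """Assumed matching rule: every component differs by at most eps."""
    return len(x) == len(y) and all(abs(p - q) <= eps for p, q in zip(x, y))


def lcss_length(a: Sequence[Reading], b: Sequence[Reading], eps: float = 0.0) -> int:
    """Length of the longest common subsequence of a and b.

    Classical dynamic programme with O(len(a) * len(b)) time,
    keeping only two rows of the table in memory.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if readings_match(x, y, eps):
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity_index(a: Sequence[Reading], b: Sequence[Reading], eps: float = 0.0) -> float:
    """Assumed normalisation: LCSS length divided by the shorter sequence length."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, eps) / min(len(a), len(b))


if __name__ == "__main__":
    # Hypothetical (temperature, humidity) readings from two sensor nodes.
    node_a = [(20.0, 45.0), (20.5, 46.0), (21.0, 47.0), (25.0, 60.0)]
    node_b = [(20.1, 45.2), (21.1, 47.1), (30.0, 70.0)]
    print(lcss_length(node_a, node_b, eps=0.5))
    print(similarity_index(node_a, node_b, eps=0.5))
```

The quadratic table fill is the cost that the description calls inefficient for multivariate data; any faster method would have to avoid visiting every pair of readings.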
format Online
Article
Text
id pubmed-6339076
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6339076 2019-01-23 LCSS-Based Algorithm for Computing Multivariate Data Set Similarity: A Case Study of Real-Time WSN Data Khan, Rahim Ali, Ihsan Altowaijri, Saleh M. Zakarya, Muhammad Ur Rahman, Atiq Ahmedy, Ismail Khan, Anwar Gani, Abdullah Sensors (Basel) Article MDPI 2019-01-04 /pmc/articles/PMC6339076/ /pubmed/30621241 http://dx.doi.org/10.3390/s19010166 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title LCSS-Based Algorithm for Computing Multivariate Data Set Similarity: A Case Study of Real-Time WSN Data
topic Article