Tree-Miner: Mining Sequential Patterns from SP-Tree
Data mining is used to extract actionable knowledge from huge amounts of raw data. In numerous real-life applications, data are stored in sequential form; hence, mining sequential patterns has been one of the most popular fields in data mining. Due to its various applications, across the past decades,...
Main Authors: | Rizvee, Redwan Ahmed; Arefin, Mohammad Fahim; Ahmed, Chowdhury Farhan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206258/ http://dx.doi.org/10.1007/978-3-030-47436-2_4 |
_version_ | 1783530379613831168 |
---|---|
author | Rizvee, Redwan Ahmed Arefin, Mohammad Fahim Ahmed, Chowdhury Farhan |
author_facet | Rizvee, Redwan Ahmed Arefin, Mohammad Fahim Ahmed, Chowdhury Farhan |
author_sort | Rizvee, Redwan Ahmed |
collection | PubMed |
description | Data mining is used to extract actionable knowledge from huge amounts of raw data. In numerous real-life applications, data are stored in sequential form; hence, mining sequential patterns has been one of the most popular fields in data mining. Owing to its many applications, a significant body of literature over the past decades has addressed this problem and provided elegant solutions. In this paper we propose a novel tree data structure, SP-Tree, to store the sequence database in a new and efficient manner. Additionally, we propose a new mining algorithm, Tree-miner, to mine sequential patterns from SP-Tree. To further enhance the performance of our algorithm, we incorporate multiple pruning techniques and optimizations. Because our SP-Tree stores the complete database, it can also be used for incremental and dynamic databases, and the tree structure is particularly advantageous for interactive mining. We demonstrate how our SP-Tree-based Tree-miner algorithm significantly outperforms all of the existing state-of-the-art algorithms across 6 real-life datasets. We conclude by discussing the possible extensions of our approach to other related fields of sequential data. |
format | Online Article Text |
id | pubmed-7206258 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-72062582020-05-08 Tree-Miner: Mining Sequential Patterns from SP-Tree Rizvee, Redwan Ahmed Arefin, Mohammad Fahim Ahmed, Chowdhury Farhan Advances in Knowledge Discovery and Data Mining Article Data mining is used to extract actionable knowledge from huge amount of raw data. In numerous real life applications, data are stored in sequential form, hence mining sequential patterns has been one of the most popular fields in data mining. Due to its various applications, across the past decades, a significant number of literature have addressed this problem and provided elegant solutions. In this paper we propose a novel tree data structure, SP-Tree, to store the sequence database in a new and efficient manner. Additionally, we propose a new mining algorithm Tree-miner to mine sequential patterns from SP-Tree. To further enhance the performance of our algorithm, we incorporate multiple pruning techniques and optimizations. As our SP-Tree stores the complete database, it can also be used for incremental and dynamic databases, tree-structure is particularly advantageous for interactive mining. We demonstrate how our SP-Tree based Tree-miner algorithm significantly outperforms all of the existing state-of-the-art algorithms, across 6 real life datasets. We conclude by discussing the possible extensions of our approach to other related fields of sequential data. 2020-04-17 /pmc/articles/PMC7206258/ http://dx.doi.org/10.1007/978-3-030-47436-2_4 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Rizvee, Redwan Ahmed Arefin, Mohammad Fahim Ahmed, Chowdhury Farhan Tree-Miner: Mining Sequential Patterns from SP-Tree |
title | Tree-Miner: Mining Sequential Patterns from SP-Tree |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206258/ http://dx.doi.org/10.1007/978-3-030-47436-2_4 |
work_keys_str_mv | AT rizveeredwanahmed treeminerminingsequentialpatternsfromsptree AT arefinmohammadfahim treeminerminingsequentialpatternsfromsptree AT ahmedchowdhuryfarhan treeminerminingsequentialpatternsfromsptree |