A primer to frequent itemset mining for bioinformatics

Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of product...

Bibliographic Details
Main authors: Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart; Laukens, Kris
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4364064/
https://www.ncbi.nlm.nih.gov/pubmed/24162173
http://dx.doi.org/10.1093/bib/bbt074
author Naulaerts, Stefan
Meysman, Pieter
Bittremieux, Wout
Vu, Trung Nghia
Vanden Berghe, Wim
Goethals, Bart
Laukens, Kris
collection PubMed
description Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and their derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences.
format Online
Article
Text
id pubmed-4364064
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4364064 2015-03-25
Brief Bioinform
Papers
Oxford University Press 2015-03 2013-10-26
/pmc/articles/PMC4364064/ /pubmed/24162173 http://dx.doi.org/10.1093/bib/bbt074
Text en
© The Author 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A primer to frequent itemset mining for bioinformatics
topic Papers