Efficient Mining of Interesting Patterns in Large Biological Sequences
Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. So far, in most approaches, the number of occurrences has been the major measure for determining whether a pattern is interesting or not. In computational biology,...
Main Authors: | Rashid, Md. Mamunur; Karim, Md. Rezaul; Jeong, Byeong-Soo; Choi, Ho-Jin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Korea Genome Organization 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3475482/ https://www.ncbi.nlm.nih.gov/pubmed/23105928 http://dx.doi.org/10.5808/GI.2012.10.1.44 |
_version_ | 1782246954787405824 |
---|---|
author | Rashid, Md. Mamunur; Karim, Md. Rezaul; Jeong, Byeong-Soo; Choi, Ho-Jin |
author_facet | Rashid, Md. Mamunur; Karim, Md. Rezaul; Jeong, Byeong-Soo; Choi, Ho-Jin |
author_sort | Rashid, Md. Mamunur |
collection | PubMed |
description | Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. So far, in most approaches, the number of occurrences has been the major measure for determining whether a pattern is interesting or not. In computational biology, however, a pattern that is not frequent may still be considered very informative if its actual support frequency exceeds the prior expectation by a large margin. In this paper, we propose a new interestingness measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach can find interesting patterns within an acceptable computation time. |
format | Online Article Text |
id | pubmed-3475482 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Korea Genome Organization |
record_format | MEDLINE/PubMed |
spelling | pubmed-34754822012-10-26 Efficient Mining of Interesting Patterns in Large Biological Sequences Rashid, Md. Mamunur Karim, Md. Rezaul Jeong, Byeong-Soo Choi, Ho-Jin Genomics Inf Article Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. So far, in most approaches, the number of occurrences is a major measure of determining whether a pattern is interesting or not. In computational biology, however, a pattern that is not frequent may still be considered very informative if its actual support frequency exceeds the prior expectation by a large margin. In this paper, we propose a new interesting measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach can find interesting patterns within an acceptable computation time. Korea Genome Organization 2012-03 2012-03-31 /pmc/articles/PMC3475482/ /pubmed/23105928 http://dx.doi.org/10.5808/GI.2012.10.1.44 Text en Copyright © 2012 by The Korea Genome Organization http://creativecommons.org/licenses/by-nc/3.0 It is identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/). |
spellingShingle | Article Rashid, Md. Mamunur Karim, Md. Rezaul Jeong, Byeong-Soo Choi, Ho-Jin Efficient Mining of Interesting Patterns in Large Biological Sequences |
title | Efficient Mining of Interesting Patterns in Large Biological Sequences |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3475482/ https://www.ncbi.nlm.nih.gov/pubmed/23105928 http://dx.doi.org/10.5808/GI.2012.10.1.44 |
work_keys_str_mv | AT rashidmdmamunur efficientminingofinterestingpatternsinlargebiologicalsequences AT karimmdrezaul efficientminingofinterestingpatternsinlargebiologicalsequences AT jeongbyeongsoo efficientminingofinterestingpatternsinlargebiologicalsequences AT choihojin efficientminingofinterestingpatternsinlargebiologicalsequences |
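
The description above says that an infrequent pattern can still be informative when its actual support exceeds the prior expectation by a large margin. The record does not give the paper's measure or its index-based method, so the following is only a minimal sketch of the general idea under stated assumptions: the expectation comes from a simple independent-base background model, and the function names `kmer_counts`, `expected_count`, and `surprise_ratio` are hypothetical, introduced here for illustration.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_count(pattern, seq):
    """Expected number of occurrences of pattern in seq.

    Assumption (not taken from the paper): each position is drawn
    independently from the base frequencies observed in seq itself.
    """
    n = len(seq)
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    prob = 1.0
    for base in pattern:
        prob *= base_freq.get(base, 0.0)
    # Number of start positions where the pattern could begin.
    positions = max(n - len(pattern) + 1, 0)
    return prob * positions


def surprise_ratio(pattern, seq):
    """Observed count divided by expected count (illustrative measure only)."""
    observed = kmer_counts(seq, len(pattern))[pattern]
    expected = expected_count(pattern, seq)
    if expected == 0:
        # The pattern uses a base absent from seq, so it cannot occur.
        return 0.0
    return observed / expected
```

A ratio well above 1 marks a pattern that occurs more often than the background model predicts, even when its raw count is small. A pattern with a high raw count but a ratio near 1 is merely as common as its base composition suggests.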