
Developing JSequitur to Study the Hierarchical Structure of Biological Sequences in a Grammatical Inference Framework of String Compression Algorithms

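Grammatical inference methods are expected to find grammatical structures hidden in biological sequences. One hopes that studies of grammar serve as an appropriate tool for theory formation. Thus, we have developed JSequitur for automatically generating the grammatical structure of biological sequences in an inference framework of string compression algorithms. Our original motivation was to find any grammatical traits of several cancer genes that can be detected by string compression algorithms. Through this research, we could not find any meaningful unique traits of the cancer genes yet, but we could observe some interesting traits in regards to the relationship among gene length, similarity of sequences, the patterns of the generated grammar, and compression rate.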

Bibliographic Details
Main authors: Galbadrakh, Bulgan; Lee, Kyung-Eun; Park, Hyun-Seok
Format: Online Article Text
Language: English
Published: Korea Genome Organization 2012
Subjects: Application Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3543929/
https://www.ncbi.nlm.nih.gov/pubmed/23346041
http://dx.doi.org/10.5808/GI.2012.10.4.266
author Galbadrakh, Bulgan
Lee, Kyung-Eun
Park, Hyun-Seok
collection PubMed
description Grammatical inference methods are expected to find grammatical structures hidden in biological sequences. One hopes that studies of grammar serve as an appropriate tool for theory formation. Thus, we have developed JSequitur for automatically generating the grammatical structure of biological sequences in an inference framework of string compression algorithms. Our original motivation was to find any grammatical traits of several cancer genes that can be detected by string compression algorithms. Through this research, we could not find any meaningful unique traits of the cancer genes yet, but we could observe some interesting traits in regards to the relationship among gene length, similarity of sequences, the patterns of the generated grammar, and compression rate.
format Online
Article
Text
id pubmed-3543929
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Korea Genome Organization
record_format MEDLINE/PubMed
spelling pubmed-3543929
2013-01-23
Developing JSequitur to Study the Hierarchical Structure of Biological Sequences in a Grammatical Inference Framework of String Compression Algorithms
Galbadrakh, Bulgan
Lee, Kyung-Eun
Park, Hyun-Seok
Genomics Inform
Application Note
Grammatical inference methods are expected to find grammatical structures hidden in biological sequences. One hopes that studies of grammar serve as an appropriate tool for theory formation. Thus, we have developed JSequitur for automatically generating the grammatical structure of biological sequences in an inference framework of string compression algorithms. Our original motivation was to find any grammatical traits of several cancer genes that can be detected by string compression algorithms. Through this research, we could not find any meaningful unique traits of the cancer genes yet, but we could observe some interesting traits in regards to the relationship among gene length, similarity of sequences, the patterns of the generated grammar, and compression rate.
Korea Genome Organization
2012-12
2012-12-31
/pmc/articles/PMC3543929/
/pubmed/23346041
http://dx.doi.org/10.5808/GI.2012.10.4.266
Text
en
Copyright © 2012 by The Korea Genome Organization
http://creativecommons.org/licenses/by-nc/3.0/
It is identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/).
title Developing JSequitur to Study the Hierarchical Structure of Biological Sequences in a Grammatical Inference Framework of String Compression Algorithms
topic Application Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3543929/
https://www.ncbi.nlm.nih.gov/pubmed/23346041
http://dx.doi.org/10.5808/GI.2012.10.4.266