Developing JSequitur to Study the Hierarchical Structure of Biological Sequences in a Grammatical Inference Framework of String Compression Algorithms
Grammatical inference methods are expected to find grammatical structures hidden in biological sequences. One hopes that studies of grammar serve as an appropriate tool for theory formation. Thus, we have developed JSequitur for automatically generating the grammatical structure of biological sequen...
Main authors: | Galbadrakh, Bulgan; Lee, Kyung-Eun; Park, Hyun-Seok |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Korea Genome Organization, 2012 |
Subjects: | Application Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3543929/ https://www.ncbi.nlm.nih.gov/pubmed/23346041 http://dx.doi.org/10.5808/GI.2012.10.4.266 |
_version_ | 1782255721660809216 |
---|---|
author | Galbadrakh, Bulgan Lee, Kyung-Eun Park, Hyun-Seok |
author_facet | Galbadrakh, Bulgan Lee, Kyung-Eun Park, Hyun-Seok |
author_sort | Galbadrakh, Bulgan |
collection | PubMed |
description | Grammatical inference methods are expected to find grammatical structures hidden in biological sequences. One hopes that studies of grammar serve as an appropriate tool for theory formation. Thus, we have developed JSequitur for automatically generating the grammatical structure of biological sequences in an inference framework of string compression algorithms. Our original motivation was to find grammatical traits of several cancer genes that can be detected by string compression algorithms. In this research, we have not yet found any meaningful traits unique to the cancer genes, but we observed some interesting relationships among gene length, sequence similarity, the patterns of the generated grammar, and compression rate. |
format | Online Article Text |
id | pubmed-3543929 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Korea Genome Organization |
record_format | MEDLINE/PubMed |
spelling | pubmed-3543929 2013-01-23 Developing JSequitur to Study the Hierarchical Structure of Biological Sequences in a Grammatical Inference Framework of String Compression Algorithms Galbadrakh, Bulgan Lee, Kyung-Eun Park, Hyun-Seok Genomics Inform Application Note Grammatical inference methods are expected to find grammatical structures hidden in biological sequences. One hopes that studies of grammar serve as an appropriate tool for theory formation. Thus, we have developed JSequitur for automatically generating the grammatical structure of biological sequences in an inference framework of string compression algorithms. Our original motivation was to find any grammatical traits of several cancer genes that can be detected by string compression algorithms. Through this research, we could not find any meaningful unique traits of the cancer genes yet, but we could observe some interesting traits in regards to the relationship among gene length, similarity of sequences, the patterns of the generated grammar, and compression rate. Korea Genome Organization 2012-12 2012-12-31 /pmc/articles/PMC3543929/ /pubmed/23346041 http://dx.doi.org/10.5808/GI.2012.10.4.266 Text en Copyright © 2012 by The Korea Genome Organization http://creativecommons.org/licenses/by-nc/3.0/ It is identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/). |
title | Developing JSequitur to Study the Hierarchical Structure of Biological Sequences in a Grammatical Inference Framework of String Compression Algorithms |
topic | Application Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3543929/ https://www.ncbi.nlm.nih.gov/pubmed/23346041 http://dx.doi.org/10.5808/GI.2012.10.4.266 |