Cargando…

Grammar-based compression approach to extraction of common rules among multiple trees of glycans and RNAs

BACKGROUND: Many tree structures are found in nature and organisms. Such trees are believed to be constructed on the basis of certain rules. We have previously developed grammar-based compression methods for ordered and unordered single trees, based on bisection-type tree grammars. Here, these metho...

Descripción completa

Detalles Bibliográficos
Autores principales: Zhao, Yang, Hayashida, Morihiro, Cao, Yue, Hwang, Jaewook, Akutsu, Tatsuya
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2015
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4419412/
https://www.ncbi.nlm.nih.gov/pubmed/25907438
http://dx.doi.org/10.1186/s12859-015-0558-4
_version_ 1782369567248482304
author Zhao, Yang
Hayashida, Morihiro
Cao, Yue
Hwang, Jaewook
Akutsu, Tatsuya
author_facet Zhao, Yang
Hayashida, Morihiro
Cao, Yue
Hwang, Jaewook
Akutsu, Tatsuya
author_sort Zhao, Yang
collection PubMed
description BACKGROUND: Many tree structures are found in nature and organisms. Such trees are believed to be constructed on the basis of certain rules. We have previously developed grammar-based compression methods for ordered and unordered single trees, based on bisection-type tree grammars. Here, these methods find construction rules for one single tree. On the other hand, specified construction rules can be utilized to generate multiple similar trees. RESULTS: Therefore, in this paper, we develop novel methods to discover common rules for the construction of multiple distinct trees, by improving and extending the previous methods using integer programming. We apply our proposed methods to several sets of glycans and RNA secondary structures, which play important roles in cellular systems, and can be regarded as tree structures. The results suggest that our method can be successfully applied to determining the minimum grammar and several common rules among glycans and RNAs. CONCLUSIONS: We propose integer programming-based methods MinSEOTGMul and MinSEUTGMul for the determination of the minimum grammars constructing multiple ordered and unordered trees, respectively. The proposed methods can provide clues for the determination of hierarchical structures contained in tree-structured biological data, beyond the extraction of frequent patterns. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0558-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4419412
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-44194122015-05-06 Grammar-based compression approach to extraction of common rules among multiple trees of glycans and RNAs Zhao, Yang Hayashida, Morihiro Cao, Yue Hwang, Jaewook Akutsu, Tatsuya BMC Bioinformatics Research Article BACKGROUND: Many tree structures are found in nature and organisms. Such trees are believed to be constructed on the basis of certain rules. We have previously developed grammar-based compression methods for ordered and unordered single trees, based on bisection-type tree grammars. Here, these methods find construction rules for one single tree. On the other hand, specified construction rules can be utilized to generate multiple similar trees. RESULTS: Therefore, in this paper, we develop novel methods to discover common rules for the construction of multiple distinct trees, by improving and extending the previous methods using integer programming. We apply our proposed methods to several sets of glycans and RNA secondary structures, which play important roles in cellular systems, and can be regarded as tree structures. The results suggest that our method can be successfully applied to determining the minimum grammar and several common rules among glycans and RNAs. CONCLUSIONS: We propose integer programming-based methods MinSEOTGMul and MinSEUTGMul for the determination of the minimum grammars constructing multiple ordered and unordered trees, respectively. The proposed methods can provide clues for the determination of hierarchical structures contained in tree-structured biological data, beyond the extraction of frequent patterns. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0558-4) contains supplementary material, which is available to authorized users. BioMed Central 2015-04-24 /pmc/articles/PMC4419412/ /pubmed/25907438 http://dx.doi.org/10.1186/s12859-015-0558-4 Text en © Zhao et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Zhao, Yang
Hayashida, Morihiro
Cao, Yue
Hwang, Jaewook
Akutsu, Tatsuya
Grammar-based compression approach to extraction of common rules among multiple trees of glycans and RNAs
title Grammar-based compression approach to extraction of common rules among multiple trees of glycans and RNAs
title_full Grammar-based compression approach to extraction of common rules among multiple trees of glycans and RNAs
title_fullStr Grammar-based compression approach to extraction of common rules among multiple trees of glycans and RNAs
title_full_unstemmed Grammar-based compression approach to extraction of common rules among multiple trees of glycans and RNAs
title_short Grammar-based compression approach to extraction of common rules among multiple trees of glycans and RNAs
title_sort grammar-based compression approach to extraction of common rules among multiple trees of glycans and rnas
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4419412/
https://www.ncbi.nlm.nih.gov/pubmed/25907438
http://dx.doi.org/10.1186/s12859-015-0558-4
work_keys_str_mv AT zhaoyang grammarbasedcompressionapproachtoextractionofcommonrulesamongmultipletreesofglycansandrnas
AT hayashidamorihiro grammarbasedcompressionapproachtoextractionofcommonrulesamongmultipletreesofglycansandrnas
AT caoyue grammarbasedcompressionapproachtoextractionofcommonrulesamongmultipletreesofglycansandrnas
AT hwangjaewook grammarbasedcompressionapproachtoextractionofcommonrulesamongmultipletreesofglycansandrnas
AT akutsutatsuya grammarbasedcompressionapproachtoextractionofcommonrulesamongmultipletreesofglycansandrnas