Cargando…

Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures

BACKGROUND: A bisection-type algorithm for the grammar-based compression of tree-structured data has been proposed recently. In this framework, an elementary ordered-tree grammar (EOTG) and an elementary unordered-tree grammar (EUTG) were defined, and an approximation algorithm was proposed. RESULTS...

Descripción completa

Detalles Bibliográficos
Autores principales: Zhao, Yang, Hayashida, Morihiro, Akutsu, Tatsuya
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2010
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3024861/
https://www.ncbi.nlm.nih.gov/pubmed/21172054
http://dx.doi.org/10.1186/1471-2105-11-S11-S4
_version_ 1782196822366748672
author Zhao, Yang
Hayashida, Morihiro
Akutsu, Tatsuya
author_facet Zhao, Yang
Hayashida, Morihiro
Akutsu, Tatsuya
author_sort Zhao, Yang
collection PubMed
description BACKGROUND: A bisection-type algorithm for the grammar-based compression of tree-structured data has been proposed recently. In this framework, an elementary ordered-tree grammar (EOTG) and an elementary unordered-tree grammar (EUTG) were defined, and an approximation algorithm was proposed. RESULTS: In this paper, we propose an integer programming-based method that finds the minimum context-free grammar (CFG) for a given string under the condition that at most two symbols appear on the right-hand side of each production rule. Next, we extend this method to find the minimum EOTG and EUTG grammars for given ordered and unordered trees, respectively. Then, we conduct computational experiments for the ordered and unordered artificial trees. Finally, we apply our methods to pattern extraction of glycan tree structures. CONCLUSIONS: We propose integer programming-based methods that find the minimum CFG, EOTG, and EUTG for given strings, ordered and unordered trees. Our proposed methods for trees are useful for extracting patterns of glycan tree structures.
format Text
id pubmed-3024861
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30248612011-01-22 Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures Zhao, Yang Hayashida, Morihiro Akutsu, Tatsuya BMC Bioinformatics Research BACKGROUND: A bisection-type algorithm for the grammar-based compression of tree-structured data has been proposed recently. In this framework, an elementary ordered-tree grammar (EOTG) and an elementary unordered-tree grammar (EUTG) were defined, and an approximation algorithm was proposed. RESULTS: In this paper, we propose an integer programming-based method that finds the minimum context-free grammar (CFG) for a given string under the condition that at most two symbols appear on the right-hand side of each production rule. Next, we extend this method to find the minimum EOTG and EUTG grammars for given ordered and unordered trees, respectively. Then, we conduct computational experiments for the ordered and unordered artificial trees. Finally, we apply our methods to pattern extraction of glycan tree structures. CONCLUSIONS: We propose integer programming-based methods that find the minimum CFG, EOTG, and EUTG for given strings, ordered and unordered trees. Our proposed methods for trees are useful for extracting patterns of glycan tree structures. BioMed Central 2010-12-14 /pmc/articles/PMC3024861/ /pubmed/21172054 http://dx.doi.org/10.1186/1471-2105-11-S11-S4 Text en Copyright ©2010 Akutsu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Zhao, Yang
Hayashida, Morihiro
Akutsu, Tatsuya
Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures
title Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures
title_full Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures
title_fullStr Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures
title_full_unstemmed Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures
title_short Integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures
title_sort integer programming-based method for grammar-based tree compression and its application to pattern extraction of glycan tree structures
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3024861/
https://www.ncbi.nlm.nih.gov/pubmed/21172054
http://dx.doi.org/10.1186/1471-2105-11-S11-S4
work_keys_str_mv AT zhaoyang integerprogrammingbasedmethodforgrammarbasedtreecompressionanditsapplicationtopatternextractionofglycantreestructures
AT hayashidamorihiro integerprogrammingbasedmethodforgrammarbasedtreecompressionanditsapplicationtopatternextractionofglycantreestructures
AT akutsutatsuya integerprogrammingbasedmethodforgrammarbasedtreecompressionanditsapplicationtopatternextractionofglycantreestructures