Cargando…

Unrooted unordered homeomorphic subtree alignment of RNA trees

We generalize some current approaches for RNA tree alignment, which are traditionally confined to ordered rooted mappings, to also consider unordered unrooted mappings. We define the Homeomorphic Subtree Alignment problem (HSA), and present a new algorithm which applies to several modes, combining g...

Descripción completa

Detalles Bibliográficos
Autores principales: Milo, Nimrod, Zakov, Shay, Katzenelson, Erez, Bachmat, Eitan, Dinitz, Yefim, Ziv-Ukelson, Michal
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2013
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765143/
https://www.ncbi.nlm.nih.gov/pubmed/23590940
http://dx.doi.org/10.1186/1748-7188-8-13
_version_ 1782283233428242432
author Milo, Nimrod
Zakov, Shay
Katzenelson, Erez
Bachmat, Eitan
Dinitz, Yefim
Ziv-Ukelson, Michal
author_facet Milo, Nimrod
Zakov, Shay
Katzenelson, Erez
Bachmat, Eitan
Dinitz, Yefim
Ziv-Ukelson, Michal
author_sort Milo, Nimrod
collection PubMed
description We generalize some current approaches for RNA tree alignment, which are traditionally confined to ordered rooted mappings, to also consider unordered unrooted mappings. We define the Homeomorphic Subtree Alignment problem (HSA), and present a new algorithm which applies to several modes, combining global or local, ordered or unordered, and rooted or unrooted tree alignments. Our algorithm generalizes previous algorithms that either solved the problem in an asymmetric manner, or were restricted to the rooted and/or ordered cases. Focusing here on the most general unrooted unordered case, we show that for input trees T and S, our algorithm has an O(n(T)n(S) + min(d(T),d(S))L(T)L(S)) time complexity, where n(T),L(T) and d(T) are the number of nodes, the number of leaves, and the maximum node degree in T, respectively (satisfying d(T) ≤ L(T) ≤ n(T)), and similarly for n(S),L(S) and d(S) with respect to the tree S. This improves the time complexity of previous algorithms for less general variants of the problem. In order to obtain this time bound for HSA, we developed new algorithms for a generalized variant of the Min-Cost Bipartite Matching problem (MCM), as well as to two derivatives of this problem, entitled All-Cavity-MCM and All-Pairs-Cavity-MCM. For two input sets of size n and m, where n ≤ m, MCM and both its cavity derivatives are solved in O(n(3) + nm) time, without the usage of priority queues (e.g. Fibonacci heaps) or other complex data structures. This gives the first cubic time algorithm for All-Pairs-Cavity-MCM, and improves the running times of MCM and All-Cavity-MCM problems in the unbalanced case where n ≪ m. We implemented the algorithm (in all modes mentioned above) as a graphical software tool which computes and displays similarities between secondary structures of RNA given as input, and employed it to a preliminary experiment in which we ran all-against-all inter-family pairwise alignments of RNAse P and Hammerhead RNA family members, exposing new similarities which could not be detected by the traditional rooted ordered alignment approaches. The results demonstrate that our approach can be used to expose structural similarity between some RNAs with higher sensitivity than the traditional rooted ordered alignment approaches. Source code and web-interface for our tool can be found in http://www.cs.bgu.ac.il/\~negevcb/FRUUT.
format Online
Article
Text
id pubmed-3765143
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-37651432013-09-10 Unrooted unordered homeomorphic subtree alignment of RNA trees Milo, Nimrod Zakov, Shay Katzenelson, Erez Bachmat, Eitan Dinitz, Yefim Ziv-Ukelson, Michal Algorithms Mol Biol Research We generalize some current approaches for RNA tree alignment, which are traditionally confined to ordered rooted mappings, to also consider unordered unrooted mappings. We define the Homeomorphic Subtree Alignment problem (HSA), and present a new algorithm which applies to several modes, combining global or local, ordered or unordered, and rooted or unrooted tree alignments. Our algorithm generalizes previous algorithms that either solved the problem in an asymmetric manner, or were restricted to the rooted and/or ordered cases. Focusing here on the most general unrooted unordered case, we show that for input trees T and S, our algorithm has an O(n(T)n(S) + min(d(T),d(S))L(T)L(S)) time complexity, where n(T),L(T) and d(T) are the number of nodes, the number of leaves, and the maximum node degree in T, respectively (satisfying d(T) ≤ L(T) ≤ n(T)), and similarly for n(S),L(S) and d(S) with respect to the tree S. This improves the time complexity of previous algorithms for less general variants of the problem. In order to obtain this time bound for HSA, we developed new algorithms for a generalized variant of the Min-Cost Bipartite Matching problem (MCM), as well as to two derivatives of this problem, entitled All-Cavity-MCM and All-Pairs-Cavity-MCM. For two input sets of size n and m, where n ≤ m, MCM and both its cavity derivatives are solved in O(n(3) + nm) time, without the usage of priority queues (e.g. Fibonacci heaps) or other complex data structures. This gives the first cubic time algorithm for All-Pairs-Cavity-MCM, and improves the running times of MCM and All-Cavity-MCM problems in the unbalanced case where n ≪ m. We implemented the algorithm (in all modes mentioned above) as a graphical software tool which computes and displays similarities between secondary structures of RNA given as input, and employed it to a preliminary experiment in which we ran all-against-all inter-family pairwise alignments of RNAse P and Hammerhead RNA family members, exposing new similarities which could not be detected by the traditional rooted ordered alignment approaches. The results demonstrate that our approach can be used to expose structural similarity between some RNAs with higher sensitivity than the traditional rooted ordered alignment approaches. Source code and web-interface for our tool can be found in http://www.cs.bgu.ac.il/\~negevcb/FRUUT. BioMed Central 2013-04-16 /pmc/articles/PMC3765143/ /pubmed/23590940 http://dx.doi.org/10.1186/1748-7188-8-13 Text en Copyright © 2013 Milo et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License(http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Milo, Nimrod
Zakov, Shay
Katzenelson, Erez
Bachmat, Eitan
Dinitz, Yefim
Ziv-Ukelson, Michal
Unrooted unordered homeomorphic subtree alignment of RNA trees
title Unrooted unordered homeomorphic subtree alignment of RNA trees
title_full Unrooted unordered homeomorphic subtree alignment of RNA trees
title_fullStr Unrooted unordered homeomorphic subtree alignment of RNA trees
title_full_unstemmed Unrooted unordered homeomorphic subtree alignment of RNA trees
title_short Unrooted unordered homeomorphic subtree alignment of RNA trees
title_sort unrooted unordered homeomorphic subtree alignment of rna trees
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765143/
https://www.ncbi.nlm.nih.gov/pubmed/23590940
http://dx.doi.org/10.1186/1748-7188-8-13
work_keys_str_mv AT milonimrod unrootedunorderedhomeomorphicsubtreealignmentofrnatrees
AT zakovshay unrootedunorderedhomeomorphicsubtreealignmentofrnatrees
AT katzenelsonerez unrootedunorderedhomeomorphicsubtreealignmentofrnatrees
AT bachmateitan unrootedunorderedhomeomorphicsubtreealignmentofrnatrees
AT dinitzyefim unrootedunorderedhomeomorphicsubtreealignmentofrnatrees
AT zivukelsonmichal unrootedunorderedhomeomorphicsubtreealignmentofrnatrees