Cargando…
Relative Suffix Trees
Suffix trees are one of the most versatile data structures in stringology, with many applications in bioinformatics. Their main drawback is their size, which can be tens of times larger than the input sequence. Much effort has been put into reducing the space usage, leading ultimately to compressed...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Oxford University Press
2018
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5956352/ https://www.ncbi.nlm.nih.gov/pubmed/29795706 http://dx.doi.org/10.1093/comjnl/bxx108 |
_version_ | 1783323872733429760 |
---|---|
author | Farruggia, Andrea Gagie, Travis Navarro, Gonzalo Puglisi, Simon J Sirén, Jouni |
author_facet | Farruggia, Andrea Gagie, Travis Navarro, Gonzalo Puglisi, Simon J Sirén, Jouni |
author_sort | Farruggia, Andrea |
collection | PubMed |
description | Suffix trees are one of the most versatile data structures in stringology, with many applications in bioinformatics. Their main drawback is their size, which can be tens of times larger than the input sequence. Much effort has been put into reducing the space usage, leading ultimately to compressed suffix trees. These compressed data structures can efficiently simulate the suffix tree, while using space proportional to a compressed representation of the sequence. In this work, we take a new approach to compressed suffix trees for repetitive sequence collections, such as collections of individual genomes. We compress the suffix trees of individual sequences relative to the suffix tree of a reference sequence. These relative data structures provide competitive time/space trade-offs, being almost as small as the smallest compressed suffix trees for repetitive collections, and competitive in time with the largest and fastest compressed suffix trees. |
format | Online Article Text |
id | pubmed-5956352 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-59563522018-05-21 Relative Suffix Trees Farruggia, Andrea Gagie, Travis Navarro, Gonzalo Puglisi, Simon J Sirén, Jouni Comput J Original article Suffix trees are one of the most versatile data structures in stringology, with many applications in bioinformatics. Their main drawback is their size, which can be tens of times larger than the input sequence. Much effort has been put into reducing the space usage, leading ultimately to compressed suffix trees. These compressed data structures can efficiently simulate the suffix tree, while using space proportional to a compressed representation of the sequence. In this work, we take a new approach to compressed suffix trees for repetitive sequence collections, such as collections of individual genomes. We compress the suffix trees of individual sequences relative to the suffix tree of a reference sequence. These relative data structures provide competitive time/space trade-offs, being almost as small as the smallest compressed suffix trees for repetitive collections, and competitive in time with the largest and fastest compressed suffix trees. Oxford University Press 2018-05 2017-11-21 /pmc/articles/PMC5956352/ /pubmed/29795706 http://dx.doi.org/10.1093/comjnl/bxx108 Text en © The British Computer Society 2017. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original article Farruggia, Andrea Gagie, Travis Navarro, Gonzalo Puglisi, Simon J Sirén, Jouni Relative Suffix Trees |
title | Relative Suffix Trees |
title_full | Relative Suffix Trees |
title_fullStr | Relative Suffix Trees |
title_full_unstemmed | Relative Suffix Trees |
title_short | Relative Suffix Trees |
title_sort | relative suffix trees |
topic | Original article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5956352/ https://www.ncbi.nlm.nih.gov/pubmed/29795706 http://dx.doi.org/10.1093/comjnl/bxx108 |
work_keys_str_mv | AT farruggiaandrea relativesuffixtrees AT gagietravis relativesuffixtrees AT navarrogonzalo relativesuffixtrees AT puglisisimonj relativesuffixtrees AT sirenjouni relativesuffixtrees |