Cargando…
Optimal computation of all tandem repeats in a weighted sequence
BACKGROUND: Tandem duplication, in the context of molecular biology, occurs as a result of mutational events in which an original segment of DNA is converted into a sequence of individual copies. More formally, a repetition or tandem repeat in a string of letters consists of exact concatenations of...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2014
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4152798/ https://www.ncbi.nlm.nih.gov/pubmed/25221616 http://dx.doi.org/10.1186/s13015-014-0021-5 |
_version_ | 1782333180216344576 |
---|---|
author | Barton, Carl Iliopoulos, Costas S Pissis, Solon P |
author_facet | Barton, Carl Iliopoulos, Costas S Pissis, Solon P |
author_sort | Barton, Carl |
collection | PubMed |
description | BACKGROUND: Tandem duplication, in the context of molecular biology, occurs as a result of mutational events in which an original segment of DNA is converted into a sequence of individual copies. More formally, a repetition or tandem repeat in a string of letters consists of exact concatenations of identical factors of the string. Biologists are interested in approximate tandem repeats and not necessarily only in exact tandem repeats. A weighted sequence is a string in which a set of letters may occur at each position with respective probabilities of occurrence. It naturally arises in many biological contexts and provides a method to realise the approximation among distinct adjacent occurrences of the same DNA segment. RESULTS: Crochemore’s repetitions algorithm, also referred to as Crochemore’s partitioning algorithm, was introduced in 1981, and was the first optimal [Formula: see text]-time algorithm to compute all repetitions in a string of length n. In this article, we present a novel variant of Crochemore’s partitioning algorithm for weighted sequences, which requires optimal [Formula: see text] time, thus improving on the best known [Formula: see text]-time algorithm (Zhang et al., 2013) for computing all repetitions in a weighted sequence of length n. |
format | Online Article Text |
id | pubmed-4152798 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-41527982014-09-12 Optimal computation of all tandem repeats in a weighted sequence Barton, Carl Iliopoulos, Costas S Pissis, Solon P Algorithms Mol Biol Research BACKGROUND: Tandem duplication, in the context of molecular biology, occurs as a result of mutational events in which an original segment of DNA is converted into a sequence of individual copies. More formally, a repetition or tandem repeat in a string of letters consists of exact concatenations of identical factors of the string. Biologists are interested in approximate tandem repeats and not necessarily only in exact tandem repeats. A weighted sequence is a string in which a set of letters may occur at each position with respective probabilities of occurrence. It naturally arises in many biological contexts and provides a method to realise the approximation among distinct adjacent occurrences of the same DNA segment. RESULTS: Crochemore’s repetitions algorithm, also referred to as Crochemore’s partitioning algorithm, was introduced in 1981, and was the first optimal [Formula: see text]-time algorithm to compute all repetitions in a string of length n. In this article, we present a novel variant of Crochemore’s partitioning algorithm for weighted sequences, which requires optimal [Formula: see text] time, thus improving on the best known [Formula: see text]-time algorithm (Zhang et al., 2013) for computing all repetitions in a weighted sequence of length n. BioMed Central 2014-08-16 /pmc/articles/PMC4152798/ /pubmed/25221616 http://dx.doi.org/10.1186/s13015-014-0021-5 Text en Copyright © 2014 Barton et al.; licensee BioMed Central http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Barton, Carl Iliopoulos, Costas S Pissis, Solon P Optimal computation of all tandem repeats in a weighted sequence |
title | Optimal computation of all tandem repeats in a weighted sequence |
title_full | Optimal computation of all tandem repeats in a weighted sequence |
title_fullStr | Optimal computation of all tandem repeats in a weighted sequence |
title_full_unstemmed | Optimal computation of all tandem repeats in a weighted sequence |
title_short | Optimal computation of all tandem repeats in a weighted sequence |
title_sort | optimal computation of all tandem repeats in a weighted sequence |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4152798/ https://www.ncbi.nlm.nih.gov/pubmed/25221616 http://dx.doi.org/10.1186/s13015-014-0021-5 |
work_keys_str_mv | AT bartoncarl optimalcomputationofalltandemrepeatsinaweightedsequence AT iliopouloscostass optimalcomputationofalltandemrepeatsinaweightedsequence AT pississolonp optimalcomputationofalltandemrepeatsinaweightedsequence |