
Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm

The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of complete set of a K...

Bibliographic Details
Main authors: Glunčić, Matko, Paar, Vladimir
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3592446/
https://www.ncbi.nlm.nih.gov/pubmed/22977183
http://dx.doi.org/10.1093/nar/gks721
_version_ 1782262117838094336
author Glunčić, Matko
Paar, Vladimir
author_facet Glunčić, Matko
Paar, Vladimir
author_sort Glunčić, Matko
collection PubMed
description The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. Its efficacy is due to the use of a complete K-string ensemble, which enables a new method of direct mapping of a symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of the Period 3 pattern in exons, implementation of the ‘magnifying glass’ effect, identification of a complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is particularly convenient for large repeat units, for highly mutated and/or complex repeats, and for global repeat maps of large genomic sequences (chromosomes and genomes).
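The idea of reading repeats off as peaks can be illustrated with a toy example. The sketch below is not the GRM implementation and is not taken from the record; it assumes a simplified stand-in technique in which, for a fixed K, the distances between consecutive occurrences of every K-string are collected into one histogram, so that a tandem repeat period shows up as the dominant distance. The function name `distance_spectrum`, the toy sequence and the choice K = 3 are all hypothetical.

```python
from collections import Counter


def distance_spectrum(seq: str, k: int) -> Counter:
    """Histogram of distances between consecutive occurrences of each k-string.

    Simplified illustration only (an assumption, not the published GRM method):
    every k-string in the sequence contributes, and a repeat unit of length P
    tends to produce a peak at distance P.
    """
    last_seen = {}        # k-string -> start position of its latest occurrence
    spectrum = Counter()  # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spectrum[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spectrum


# Toy tandem repeat: a 7-bp unit, six copies, one copy carrying a substitution.
unit = "ACGTTGA"
seq = unit * 3 + "ACCTTGA" + unit * 2

spectrum = distance_spectrum(seq, k=3)
dominant_period, _ = spectrum.most_common(1)[0]
print(dominant_period)  # the 7-bp period dominates despite the substitution
```

In this toy case the substituted copy only shifts a few counts from distance 7 to distance 14, which gives an informal sense of why a histogram over all K-strings can tolerate point mutations.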
format Online
Article
Text
id pubmed-3592446
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3592446 2013-03-08 Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm Glunčić, Matko Paar, Vladimir Nucleic Acids Res Methods Online Oxford University Press 2013-01 2012-09-12 /pmc/articles/PMC3592446/ /pubmed/22977183 http://dx.doi.org/10.1093/nar/gks721 Text en © The Author(s) 2012. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Online
Glunčić, Matko
Paar, Vladimir
Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm
title Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3592446/
https://www.ncbi.nlm.nih.gov/pubmed/22977183
http://dx.doi.org/10.1093/nar/gks721