Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm
The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of complete set of a K...
Main Authors: | Glunčić, Matko; Paar, Vladimir |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2013 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3592446/ https://www.ncbi.nlm.nih.gov/pubmed/22977183 http://dx.doi.org/10.1093/nar/gks721 |
_version_ | 1782262117838094336 |
---|---|
author | Glunčić, Matko Paar, Vladimir |
author_facet | Glunčić, Matko Paar, Vladimir |
author_sort | Glunčić, Matko |
collection | PubMed |
description | The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. Its efficacy is due to the use of the complete set of a K-string ensemble, which enables a new method of direct mapping of a symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of the Period 3 pattern in exons, implementation of the ‘magnifying glass’ effect, identification of complex HOR patterns, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is particularly convenient for large repeat units, for highly mutated and/or complex repeats, and for global repeat maps of large genomic sequences (chromosomes and genomes). |
format | Online Article Text |
id | pubmed-3592446 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-35924462013-03-08 Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm Glunčić, Matko Paar, Vladimir Nucleic Acids Res Methods Online The main feature of global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of complete set of a K-string ensemble which enables a new method of direct mapping of symbolic DNA sequence into frequency domain, with straightforward identification of repeats as peaks in GRM diagram. In this way, we obtain very fast, efficient and highly automatized repeat finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use, in order to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of Period 3 pattern in exons, implementation of ‘magnifying glass’ effect, identification of complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. GRM algorithm is convenient for use, in particular, in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes). Oxford University Press 2013-01 2012-09-12 /pmc/articles/PMC3592446/ /pubmed/22977183 http://dx.doi.org/10.1093/nar/gks721 Text en © The Author(s) 2012. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Online Glunčić, Matko Paar, Vladimir Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm |
title | Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3592446/ https://www.ncbi.nlm.nih.gov/pubmed/22977183 http://dx.doi.org/10.1093/nar/gks721 |
work_keys_str_mv | AT gluncicmatko directmappingofsymbolicdnasequenceintofrequencydomaininglobalrepeatmapalgorithm AT paarvladimir directmappingofsymbolicdnasequenceintofrequencydomaininglobalrepeatmapalgorithm |
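
The description above says that repeats show up as peaks once the symbolic sequence is mapped through a K-string ensemble. The record does not give the procedure itself, so the following is only a simplified sketch of that general idea and not the authors' GRM implementation. It assumes that the "frequency domain" can be approximated by a histogram of distances between consecutive occurrences of each K-string. The function names `kstring_distance_spectrum` and `top_peaks`, the choice `k=8`, and the toy sequence are all hypothetical.

```python
from collections import Counter


def kstring_distance_spectrum(seq, k=8):
    """Count distances between consecutive occurrences of every K-string.

    Scanning the K-strings that actually occur in `seq` covers the whole
    ensemble of 4**k possible strings, because absent strings contribute
    nothing to the histogram.
    """
    last_seen = {}        # K-string -> start position of its latest occurrence
    spectrum = Counter()  # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spectrum[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spectrum


def top_peaks(spectrum, n=3):
    """Return the n most frequent distances (candidate repeat lengths)."""
    return [distance for distance, _ in spectrum.most_common(n)]


if __name__ == "__main__":
    # Toy tandem repeat: a 20-base unit copied 10 times (illustrative data only).
    unit = "ACGTTGCAAGCTTAGGCCAT"
    spectrum = kstring_distance_spectrum(unit * 10, k=8)
    print(top_peaks(spectrum, 1))  # the dominant distance equals the unit length, 20
```

In this toy case every 8-base string recurs exactly one unit length later, so the histogram has a single peak at 20. Real sequences with substitutions or indels would spread counts over neighboring distances, which this sketch does not attempt to handle.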