Fast and global detection of periodic sequence repeats in large genomic resources

Periodically repeating DNA and protein elements are involved in various important biological events including genomic evolution, gene regulation, protein complex formation, and immunity. Notably, the currently used genome editing tools such as ZFNs, TALENs, and CRISPRs are also all associated with p...

Bibliographic Details
Main Authors: Mori, Hideto, Evans-Yamamoto, Daniel, Ishiguro, Soh, Tomita, Masaru, Yachie, Nozomu
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6344855/
https://www.ncbi.nlm.nih.gov/pubmed/30304510
http://dx.doi.org/10.1093/nar/gky890
_version_ 1783389486239973376
author Mori, Hideto
Evans-Yamamoto, Daniel
Ishiguro, Soh
Tomita, Masaru
Yachie, Nozomu
author_facet Mori, Hideto
Evans-Yamamoto, Daniel
Ishiguro, Soh
Tomita, Masaru
Yachie, Nozomu
author_sort Mori, Hideto
collection PubMed
description Periodically repeating DNA and protein elements are involved in various important biological events including genomic evolution, gene regulation, protein complex formation, and immunity. Notably, the currently used genome editing tools such as ZFNs, TALENs, and CRISPRs are also all associated with periodically repeating biomolecules of natural organisms. Despite the biological importance of periodically repeating sequences and the expectation that new genome editing modules could be discovered from such periodical repeats, no software that globally detects such structured elements in large genomic resources in a high-throughput and unsupervised manner has been developed. We developed new software, SPADE (Search for Patterned DNA Elements), that exhaustively explores periodic DNA and protein repeats from large-scale genomic datasets based on k-mer periodicity evaluation. With a simple constraint, sequence periodicity, SPADE captured reported genome-editing-associated sequences and other protein families involving repeating domains such as tetratricopeptide, ankyrin and WD40 repeats with better performance than the other software designed for limited sets of repetitive biomolecular sequences, suggesting the high potential of this software to contribute to the discovery of new biological events and new genome editing modules.
format Online
Article
Text
id pubmed-6344855
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-63448552019-01-29 Fast and global detection of periodic sequence repeats in large genomic resources Mori, Hideto Evans-Yamamoto, Daniel Ishiguro, Soh Tomita, Masaru Yachie, Nozomu Nucleic Acids Res Methods Online Periodically repeating DNA and protein elements are involved in various important biological events including genomic evolution, gene regulation, protein complex formation, and immunity. Notably, the currently used genome editing tools such as ZFNs, TALENs, and CRISPRs are also all associated with periodically repeating biomolecules of natural organisms. Despite the biological importance of periodically repeating sequences and the expectation that new genome editing modules could be discovered from such periodical repeats, no software that globally detects such structured elements in large genomic resources in a high-throughput and unsupervised manner has been developed. We developed new software, SPADE (Search for Patterned DNA Elements), that exhaustively explores periodic DNA and protein repeats from large-scale genomic datasets based on k-mer periodicity evaluation. With a simple constraint, sequence periodicity, SPADE captured reported genome-editing-associated sequences and other protein families involving repeating domains such as tetratricopeptide, ankyrin and WD40 repeats with better performance than the other software designed for limited sets of repetitive biomolecular sequences, suggesting the high potential of this software to contribute to the discovery of new biological events and new genome editing modules. Oxford University Press 2019-01-25 2018-10-10 /pmc/articles/PMC6344855/ /pubmed/30304510 http://dx.doi.org/10.1093/nar/gky890 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Online
Mori, Hideto
Evans-Yamamoto, Daniel
Ishiguro, Soh
Tomita, Masaru
Yachie, Nozomu
Fast and global detection of periodic sequence repeats in large genomic resources
title Fast and global detection of periodic sequence repeats in large genomic resources
title_full Fast and global detection of periodic sequence repeats in large genomic resources
title_fullStr Fast and global detection of periodic sequence repeats in large genomic resources
title_full_unstemmed Fast and global detection of periodic sequence repeats in large genomic resources
title_short Fast and global detection of periodic sequence repeats in large genomic resources
title_sort fast and global detection of periodic sequence repeats in large genomic resources
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6344855/
https://www.ncbi.nlm.nih.gov/pubmed/30304510
http://dx.doi.org/10.1093/nar/gky890