
Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure

We have developed a de novo method for the identification of dispersed repeats based on the use of random position-weight matrices (PWMs) and an iterative procedure (IP). The created algorithm (IP method) allows detection of dispersed repeats for which the average number of substitutions between any...


Bibliographic Details
Main authors: Korotkov, Eugene, Suvorova, Yulia, Kostenko, Dimitry, Korotkova, Maria
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10341722/
https://www.ncbi.nlm.nih.gov/pubmed/37446142
http://dx.doi.org/10.3390/ijms241310964
_version_ 1785072329477324800
author Korotkov, Eugene
Suvorova, Yulia
Kostenko, Dimitry
Korotkova, Maria
author_facet Korotkov, Eugene
Suvorova, Yulia
Kostenko, Dimitry
Korotkova, Maria
author_sort Korotkov, Eugene
collection PubMed
description We have developed a de novo method for the identification of dispersed repeats based on the use of random position-weight matrices (PWMs) and an iterative procedure (IP). The created algorithm (IP method) allows detection of dispersed repeats for which the average number of substitutions between any two repeats per nucleotide (x) is less than or equal to 1.5. We have shown that all previously developed methods and algorithms (RED, RECON, and some others) can only find dispersed repeats for x ≤ 1.0. We applied the IP method to find dispersed repeats in the genomes of E. coli and nine other bacterial species. We identify three families of approximately 1.09 × 10^6, 0.64 × 10^6, and 0.58 × 10^6 DNA bases, respectively, constituting almost 50% of the complete E. coli genome. The length of the repeats is in the range of 400 to 600 bp. Other analyzed bacterial genomes contain one to three families of dispersed repeats with a total number of 10^3 to 6 × 10^3 copies. The existence of such highly divergent repeats could be associated with the presence of a single-type triplet periodicity in various genes or with the packing of bacterial DNA into a nucleoid.
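The "almost 50%" figure in the description can be checked with simple arithmetic. The sketch below sums the three family sizes quoted in the abstract and divides by a genome length. That genome length is an assumption that does not appear in this record: about 4.64 × 10^6 bases, the approximate size of E. coli K-12 MG1655. The strain the authors analyzed may differ, so this is a rough plausibility check, not a reproduction of their calculation.

```python
# Family sizes in DNA bases, as quoted in the abstract.
family_bases = [1.09e6, 0.64e6, 0.58e6]

# ASSUMPTION (not stated in this record): approximate genome length of
# E. coli K-12 MG1655. Other strains have somewhat different lengths.
ASSUMED_GENOME_BASES = 4.64e6


def repeat_fraction(families, genome_bases):
    """Return the share of the genome covered by the listed repeat families."""
    return sum(families) / genome_bases


total = sum(family_bases)
fraction = repeat_fraction(family_bases, ASSUMED_GENOME_BASES)

print(f"total repeat bases: {total:.2e}")
print(f"fraction of genome: {fraction:.1%}")
```

Under this assumed genome length the three families cover a little under half of the genome, which is consistent with the abstract's wording.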
format Online
Article
Text
id pubmed-10341722
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10341722 2023-07-14 Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure Korotkov, Eugene Suvorova, Yulia Kostenko, Dimitry Korotkova, Maria Int J Mol Sci Article We have developed a de novo method for the identification of dispersed repeats based on the use of random position-weight matrices (PWMs) and an iterative procedure (IP). The created algorithm (IP method) allows detection of dispersed repeats for which the average number of substitutions between any two repeats per nucleotide (x) is less than or equal to 1.5. We have shown that all previously developed methods and algorithms (RED, RECON, and some others) can only find dispersed repeats for x ≤ 1.0. We applied the IP method to find dispersed repeats in the genomes of E. coli and nine other bacterial species. We identify three families of approximately 1.09 × 10^6, 0.64 × 10^6, and 0.58 × 10^6 DNA bases, respectively, constituting almost 50% of the complete E. coli genome. The length of the repeats is in the range of 400 to 600 bp. Other analyzed bacterial genomes contain one to three families of dispersed repeats with a total number of 10^3 to 6 × 10^3 copies. The existence of such highly divergent repeats could be associated with the presence of a single-type triplet periodicity in various genes or with the packing of bacterial DNA into a nucleoid. MDPI 2023-06-30 /pmc/articles/PMC10341722/ /pubmed/37446142 http://dx.doi.org/10.3390/ijms241310964 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Korotkov, Eugene
Suvorova, Yulia
Kostenko, Dimitry
Korotkova, Maria
Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure
title Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure
title_full Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure
title_fullStr Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure
title_full_unstemmed Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure
title_short Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure
title_sort search for dispersed repeats in bacterial genomes using an iterative procedure
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10341722/
https://www.ncbi.nlm.nih.gov/pubmed/37446142
http://dx.doi.org/10.3390/ijms241310964
work_keys_str_mv AT korotkoveugene searchfordispersedrepeatsinbacterialgenomesusinganiterativeprocedure
AT suvorovayulia searchfordispersedrepeatsinbacterialgenomesusinganiterativeprocedure
AT kostenkodimitry searchfordispersedrepeatsinbacterialgenomesusinganiterativeprocedure
AT korotkovamaria searchfordispersedrepeatsinbacterialgenomesusinganiterativeprocedure