Detection of Highly Divergent Tandem Repeats in the Rice Genome

Currently, there is a lack of bioinformatics approaches to identify highly divergent tandem repeats (TRs) in eukaryotic genomes. Here, we developed a new mathematical method to search for TRs, which uses a novel algorithm for constructing multiple alignments based on the generation of random positio...

Bibliographic Details
Main Authors: Korotkov, Eugene V., Kamionskya, Anastasiya M., Korotkova, Maria A.
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8064497/
https://www.ncbi.nlm.nih.gov/pubmed/33806152
http://dx.doi.org/10.3390/genes12040473
_version_ 1783682147902554112
author Korotkov, Eugene V.
Kamionskya, Anastasiya M.
Korotkova, Maria A.
author_facet Korotkov, Eugene V.
Kamionskya, Anastasiya M.
Korotkova, Maria A.
author_sort Korotkov, Eugene V.
collection PubMed
description Currently, there is a lack of bioinformatics approaches to identify highly divergent tandem repeats (TRs) in eukaryotic genomes. Here, we developed a new mathematical method to search for TRs, which uses a novel algorithm for constructing multiple alignments based on the generation of random position weight matrices (RPWMs), and applied it to detect TRs 2 to 50 nucleotides long in the rice genome. The RPWM method could find highly divergent TRs in the presence of insertions or deletions. Comparison of the RPWM algorithm with other methods of TR identification showed that RPWM could detect TRs in which the average number of base substitutions per nucleotide (x) was between 1.5 and 3.2, whereas the T-REKS and TRF methods could not detect divergent TRs with x > 1.5. When applied to the search for TRs in the rice genome, the RPWM method revealed that TRs occupied 5% of the genome and that most of them were 2 or 3 bases long. Using RPWM, we also revealed a correlation of TRs with dispersed repeats and transposons, suggesting that some transposons originated from TRs. Thus, the novel RPWM algorithm is an effective tool to search for highly divergent TRs in genomes.
format Online
Article
Text
id pubmed-8064497
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-80644972021-04-24 Detection of Highly Divergent Tandem Repeats in the Rice Genome Korotkov, Eugene V. Kamionskya, Anastasiya M. Korotkova, Maria A. Genes (Basel) Article Currently, there is a lack of bioinformatics approaches to identify highly divergent tandem repeats (TRs) in eukaryotic genomes. Here, we developed a new mathematical method to search for TRs, which uses a novel algorithm for constructing multiple alignments based on the generation of random position weight matrices (RPWMs), and applied it to detect TRs of 2 to 50 nucleotides long in the rice genome. The RPWM method could find highly divergent TRs in the presence of insertions or deletions. Comparison of the RPWM algorithm with the other methods of TR identification showed that RPWM could detect TRs in which the average number of base substitutions per nucleotide (x) was between 1.5 and 3.2, whereas T-REKS and TRF methods could not detect divergent TRs with x > 1.5. Applied to the search of TRs in the rice genome, the RPWM method revealed that TRs occupied 5% of the genome and that most of them were 2 and 3 bases long. Using RPWM, we also revealed the correlation of TRs with dispersed repeats and transposons, suggesting that some transposons originated from TRs. Thus, the novel RPWM algorithm is an effective tool to search for highly divergent TRs in the genomes. MDPI 2021-03-25 /pmc/articles/PMC8064497/ /pubmed/33806152 http://dx.doi.org/10.3390/genes12040473 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Korotkov, Eugene V.
Kamionskya, Anastasiya M.
Korotkova, Maria A.
Detection of Highly Divergent Tandem Repeats in the Rice Genome
title Detection of Highly Divergent Tandem Repeats in the Rice Genome
title_full Detection of Highly Divergent Tandem Repeats in the Rice Genome
title_fullStr Detection of Highly Divergent Tandem Repeats in the Rice Genome
title_full_unstemmed Detection of Highly Divergent Tandem Repeats in the Rice Genome
title_short Detection of Highly Divergent Tandem Repeats in the Rice Genome
title_sort detection of highly divergent tandem repeats in the rice genome
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8064497/
https://www.ncbi.nlm.nih.gov/pubmed/33806152
http://dx.doi.org/10.3390/genes12040473
work_keys_str_mv AT korotkoveugenev detectionofhighlydivergenttandemrepeatsinthericegenome
AT kamionskyaanastasiyam detectionofhighlydivergenttandemrepeatsinthericegenome
AT korotkovamariaa detectionofhighlydivergenttandemrepeatsinthericegenome