Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes

A new mathematical method for potential reading frameshift detection in protein-coding sequences (cds) was developed. The algorithm is adjusted to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm. This does not require any preliminary training. Usin...

Bibliographic Details
Main authors: Suvorova, Y M, Korotkova, M A, Skryabin, K G, Korotkov, E V
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6476729/
https://www.ncbi.nlm.nih.gov/pubmed/30726896
http://dx.doi.org/10.1093/dnares/dsy046
_version_ 1783412920612290560
author Suvorova, Y M
Korotkova, M A
Skryabin, K G
Korotkov, E V
author_facet Suvorova, Y M
Korotkova, M A
Skryabin, K G
Korotkov, E V
author_sort Suvorova, Y M
collection PubMed
description A new mathematical method for potential reading frameshift detection in protein-coding sequences (cds) was developed. The algorithm is adjusted to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm. This does not require any preliminary training. Using the developed method, cds from the Arabidopsis thaliana genome were analysed. In total, the algorithm found 9,930 sequences containing one or more potential reading frameshift(s). This is ∼21% of all analysed sequences of the genome. The Type I and Type II error rates were estimated as 11% and 30%, respectively. Similar results were obtained for the genomes of Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Rattus norvegicus and Xenopus tropicalis. Also, the developed algorithm was tested on 17 bacterial genomes. We compared our results with the previously obtained data on the search for potential reading frameshifts in these genomes. This study discussed the possibility that the reading frameshift seems like a relatively frequently encountered mutation; and this mutation could participate in the creation of new genes and proteins.
format Online
Article
Text
id pubmed-6476729
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-64767292019-04-25 Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes Suvorova, Y M Korotkova, M A Skryabin, K G Korotkov, E V DNA Res Full Papers A new mathematical method for potential reading frameshift detection in protein-coding sequences (cds) was developed. The algorithm is adjusted to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm. This does not require any preliminary training. Using the developed method, cds from the Arabidopsis thaliana genome were analysed. In total, the algorithm found 9,930 sequences containing one or more potential reading frameshift(s). This is ∼21% of all analysed sequences of the genome. The Type I and Type II error rates were estimated as 11% and 30%, respectively. Similar results were obtained for the genomes of Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Rattus norvegicus and Xenopus tropicalis. Also, the developed algorithm was tested on 17 bacterial genomes. We compared our results with the previously obtained data on the search for potential reading frameshifts in these genomes. This study discussed the possibility that the reading frameshift seems like a relatively frequently encountered mutation; and this mutation could participate in the creation of new genes and proteins. Oxford University Press 2019-04 2019-02-04 /pmc/articles/PMC6476729/ /pubmed/30726896 http://dx.doi.org/10.1093/dnares/dsy046 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Kazusa DNA Research Institute. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Full Papers
Suvorova, Y M
Korotkova, M A
Skryabin, K G
Korotkov, E V
Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes
title Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes
title_full Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes
title_fullStr Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes
title_full_unstemmed Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes
title_short Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes
title_sort search for potential reading frameshifts in cds from arabidopsis thaliana and other genomes
topic Full Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6476729/
https://www.ncbi.nlm.nih.gov/pubmed/30726896
http://dx.doi.org/10.1093/dnares/dsy046
work_keys_str_mv AT suvorovaym searchforpotentialreadingframeshiftsincdsfromarabidopsisthalianaandothergenomes
AT korotkovama searchforpotentialreadingframeshiftsincdsfromarabidopsisthalianaandothergenomes
AT skryabinkg searchforpotentialreadingframeshiftsincdsfromarabidopsisthalianaandothergenomes
AT korotkovev searchforpotentialreadingframeshiftsincdsfromarabidopsisthalianaandothergenomes