Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes
A new mathematical method for potential reading frameshift detection in protein-coding sequences (cds) was developed. The algorithm is adjusted to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm. This does not require any preliminary training. Usin...
Main authors: | Suvorova, Y M; Korotkova, M A; Skryabin, K G; Korotkov, E V |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Full Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6476729/ https://www.ncbi.nlm.nih.gov/pubmed/30726896 http://dx.doi.org/10.1093/dnares/dsy046 |
_version_ | 1783412920612290560 |
---|---|
author | Suvorova, Y M Korotkova, M A Skryabin, K G Korotkov, E V |
author_facet | Suvorova, Y M Korotkova, M A Skryabin, K G Korotkov, E V |
author_sort | Suvorova, Y M |
collection | PubMed |
description | A new mathematical method for potential reading frameshift detection in protein-coding sequences (cds) was developed. The algorithm is adjusted to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm. This does not require any preliminary training. Using the developed method, cds from the Arabidopsis thaliana genome were analysed. In total, the algorithm found 9,930 sequences containing one or more potential reading frameshift(s). This is ∼21% of all analysed sequences of the genome. The Type I and Type II error rates were estimated as 11% and 30%, respectively. Similar results were obtained for the genomes of Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Rattus norvegicus and Xenopus tropicalis. Also, the developed algorithm was tested on 17 bacterial genomes. We compared our results with the previously obtained data on the search for potential reading frameshifts in these genomes. This study discussed the possibility that the reading frameshift seems like a relatively frequently encountered mutation; and this mutation could participate in the creation of new genes and proteins. |
format | Online Article Text |
id | pubmed-6476729 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-64767292019-04-25 Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes Suvorova, Y M Korotkova, M A Skryabin, K G Korotkov, E V DNA Res Full Papers A new mathematical method for potential reading frameshift detection in protein-coding sequences (cds) was developed. The algorithm is adjusted to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm. This does not require any preliminary training. Using the developed method, cds from the Arabidopsis thaliana genome were analysed. In total, the algorithm found 9,930 sequences containing one or more potential reading frameshift(s). This is ∼21% of all analysed sequences of the genome. The Type I and Type II error rates were estimated as 11% and 30%, respectively. Similar results were obtained for the genomes of Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Rattus norvegicus and Xenopus tropicalis. Also, the developed algorithm was tested on 17 bacterial genomes. We compared our results with the previously obtained data on the search for potential reading frameshifts in these genomes. This study discussed the possibility that the reading frameshift seems like a relatively frequently encountered mutation; and this mutation could participate in the creation of new genes and proteins. Oxford University Press 2019-04 2019-02-04 /pmc/articles/PMC6476729/ /pubmed/30726896 http://dx.doi.org/10.1093/dnares/dsy046 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of Kazusa DNA Research Institute. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Full Papers Suvorova, Y M Korotkova, M A Skryabin, K G Korotkov, E V Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes |
title | Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes |
title_full | Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes |
title_fullStr | Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes |
title_full_unstemmed | Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes |
title_short | Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes |
title_sort | search for potential reading frameshifts in cds from arabidopsis thaliana and other genomes |
topic | Full Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6476729/ https://www.ncbi.nlm.nih.gov/pubmed/30726896 http://dx.doi.org/10.1093/dnares/dsy046 |
work_keys_str_mv | AT suvorovaym searchforpotentialreadingframeshiftsincdsfromarabidopsisthalianaandothergenomes AT korotkovama searchforpotentialreadingframeshiftsincdsfromarabidopsisthalianaandothergenomes AT skryabinkg searchforpotentialreadingframeshiftsincdsfromarabidopsisthalianaandothergenomes AT korotkovev searchforpotentialreadingframeshiftsincdsfromarabidopsisthalianaandothergenomes |