Predicting Statistical Properties of Open Reading Frames in Bacterial Genomes

An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of...
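
As a rough illustration of why ORF length statistics depend on base composition, the sketch below computes a simple null expectation. It is not the model from the article, which is based on codon composition. It assumes, for illustration only, that bases are independent and that G and C (and likewise A and T) are equally frequent. All function names are hypothetical.

```python
# Illustrative null model only; NOT the analytical model described in the article.
# Assumptions (ours): bases are independent, P(G) = P(C) and P(A) = P(T).

STOP_CODONS = ("TAA", "TAG", "TGA")  # standard bacterial stop codons


def base_probabilities(gc_content):
    """Return per-base probabilities for a GC fraction strictly between 0 and 1."""
    if not 0.0 < gc_content < 1.0:
        raise ValueError("gc_content must be strictly between 0 and 1")
    gc_half = gc_content / 2.0
    at_half = (1.0 - gc_content) / 2.0
    return {"G": gc_half, "C": gc_half, "A": at_half, "T": at_half}


def stop_probability(gc_content):
    """Probability that a random codon is a stop codon under the assumptions above."""
    p = base_probabilities(gc_content)
    return sum(p[c[0]] * p[c[1]] * p[c[2]] for c in STOP_CODONS)


def mean_codons_before_stop(gc_content):
    """Expected number of non-stop codons read before the first stop (geometric law)."""
    q = stop_probability(gc_content)
    return (1.0 - q) / q


def prob_run_at_least(gc_content, n_codons):
    """Probability that n_codons consecutive codons contain no stop codon."""
    return (1.0 - stop_probability(gc_content)) ** n_codons


if __name__ == "__main__":
    # 0.21 and 0.74 are the GC extremes mentioned in the record's description field.
    for gc in (0.21, 0.50, 0.74):
        print(gc, stop_probability(gc), mean_codons_before_stop(gc),
              prob_run_at_least(gc, 100))
```

Under these assumptions, a 50% GC content gives a stop probability of 3/64 per codon, so about 20 codons are expected before the first stop. Because all three stop codons are AT-rich, the stop probability falls as GC content rises, and long stop-free runs become more likely by chance alone.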

Bibliographic Details
Main authors: Mir, Katharina; Neuhaus, Klaus; Scherer, Siegfried; Bossert, Martin; Schober, Steffen
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3454372/
https://www.ncbi.nlm.nih.gov/pubmed/23028785
http://dx.doi.org/10.1371/journal.pone.0045103
author Mir, Katharina
Neuhaus, Klaus
Scherer, Siegfried
Bossert, Martin
Schober, Steffen
collection PubMed
description An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
format Online
Article
Text
id pubmed-3454372
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3454372 2012-10-01 Predicting Statistical Properties of Open Reading Frames in Bacterial Genomes Mir, Katharina Neuhaus, Klaus Scherer, Siegfried Bossert, Martin Schober, Steffen PLoS One Research Article An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments. Public Library of Science 2012-09-24 /pmc/articles/PMC3454372/ /pubmed/23028785 http://dx.doi.org/10.1371/journal.pone.0045103 Text en © 2012 Mir et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Predicting Statistical Properties of Open Reading Frames in Bacterial Genomes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3454372/
https://www.ncbi.nlm.nih.gov/pubmed/23028785
http://dx.doi.org/10.1371/journal.pone.0045103