
A novel method for accurate operon predictions in all sequenced prokaryotes

We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia c...


Bibliographic Details
Main authors: Price, Morgan N.; Huang, Katherine H.; Alm, Eric J.; Arkin, Adam P.
Format: Text
Language: English
Published: Oxford University Press 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC549399/
https://www.ncbi.nlm.nih.gov/pubmed/15701760
http://dx.doi.org/10.1093/nar/gki232
author Price, Morgan N.
Huang, Katherine H.
Alm, Eric J.
Arkin, Adam P.
collection PubMed
description We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85 and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E.coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E.coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes, and find several surprises: H.pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC 6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.
format Text
id pubmed-549399
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-549399 2005-02-24 A novel method for accurate operon predictions in all sequenced prokaryotes Price, Morgan N. Huang, Katherine H. Alm, Eric J. Arkin, Adam P. Nucleic Acids Res Article Oxford University Press 2005 2005-02-08 /pmc/articles/PMC549399/ /pubmed/15701760 http://dx.doi.org/10.1093/nar/gki232 Text en © The Author 2005. Published by Oxford University Press. All rights reserved
title A novel method for accurate operon predictions in all sequenced prokaryotes
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC549399/
https://www.ncbi.nlm.nih.gov/pubmed/15701760
http://dx.doi.org/10.1093/nar/gki232