Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

BACKGROUND: The 16S rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often be...

Bibliographic Details
Main authors: Lan, Yemin; Rosen, Gail; Hershberg, Ruth
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4853863/
https://www.ncbi.nlm.nih.gov/pubmed/27138046
http://dx.doi.org/10.1186/s40168-016-0162-5
author Lan, Yemin
Rosen, Gail
Hershberg, Ruth
collection PubMed
description BACKGROUND: The 16S rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16S rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. RESULTS: In this paper, we examine how well levels of similarity of 16S rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16S rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. CONCLUSIONS: Our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16S rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show the lowest levels of sequence conservation within different prokaryotic lineages. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-016-0162-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4853863
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4853863 2016-05-04 Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains Lan, Yemin; Rosen, Gail; Hershberg, Ruth Microbiome Research BioMed Central 2016-05-03 /pmc/articles/PMC4853863/ /pubmed/27138046 http://dx.doi.org/10.1186/s40168-016-0162-5 Text en © Lan et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4853863/
https://www.ncbi.nlm.nih.gov/pubmed/27138046
http://dx.doi.org/10.1186/s40168-016-0162-5