Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data
BACKGROUND: One objective of metagenomics is to reconstruct information about specific uncultured organisms from fragmentary environmental DNA sequences. We used the genome of an isolate of the marine alphaproteobacterium SAR11 ('Candidatus Pelagibacter ubique'; strain HTCC1062), obtained...
Main authors: | Wilhelm, Larry J; Tripp, H James; Givan, Scott A; Smith, Daniel P; Giovannoni, Stephen J |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2217521/ https://www.ncbi.nlm.nih.gov/pubmed/17988398 http://dx.doi.org/10.1186/1745-6150-2-27 |
_version_ | 1782149271905107968 |
---|---|
author | Wilhelm, Larry J Tripp, H James Givan, Scott A Smith, Daniel P Giovannoni, Stephen J |
author_facet | Wilhelm, Larry J Tripp, H James Givan, Scott A Smith, Daniel P Giovannoni, Stephen J |
author_sort | Wilhelm, Larry J |
collection | PubMed |
description | BACKGROUND: One objective of metagenomics is to reconstruct information about specific uncultured organisms from fragmentary environmental DNA sequences. We used the genome of an isolate of the marine alphaproteobacterium SAR11 ('Candidatus Pelagibacter ubique'; strain HTCC1062), obtained from the cold, productive Oregon coast, as a query sequence to study variation in SAR11 metagenome sequence data from the Sargasso Sea, a warm, oligotrophic ocean gyre. RESULTS: The average amino acid identity of SAR11 genes encoded by the metagenomic data to the query genome was only 71%, indicating significant evolutionary divergence between the coastal isolates and Sargasso Sea populations. However, an analysis of gene neighbors indicated that SAR11 genes in the Sargasso Sea metagenomic data match the gene order of the HTCC1062 genome in 96% of cases (> 85,000 observations), and that rearrangements are most frequent at predicted operon boundaries. There were no conserved examples of genes with known functions being found in the coastal isolates, but not the Sargasso Sea metagenomic data, or vice versa, suggesting that core regions of these diverse SAR11 genomes are relatively conserved in gene content. However, four hypervariable regions were observed, which may encode properties associated with variation in SAR11 ecotypes. The largest of these, HVR2, is a 48 kb region flanked by the sole 5S and 23S genes in the HTCC1062 genome, and mainly encodes genes that determine cell surface properties. A comparison of two closely related 'Candidatus Pelagibacter' genomes (HTCC1062 and HTCC1002) revealed a number of "gene indels" in core regions. Most of these were found to be polymorphic in the metagenomic data and showed evidence of purifying selection, suggesting that the same "polymorphic gene indels" are maintained in physically isolated SAR11 populations. CONCLUSION: These findings suggest that natural selection has conserved many core features of SAR11 genomes across broad oceanic scales, but significant variation was found associated with four hypervariable genome regions. The data also led to the hypothesis that some gene insertions and deletions might be polymorphisms, similar to allelic polymorphisms. |
format | Text |
id | pubmed-2217521 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2217521 2008-01-30 Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data Wilhelm, Larry J Tripp, H James Givan, Scott A Smith, Daniel P Giovannoni, Stephen J Biol Direct Research BACKGROUND: One objective of metagenomics is to reconstruct information about specific uncultured organisms from fragmentary environmental DNA sequences. We used the genome of an isolate of the marine alphaproteobacterium SAR11 ('Candidatus Pelagibacter ubique'; strain HTCC1062), obtained from the cold, productive Oregon coast, as a query sequence to study variation in SAR11 metagenome sequence data from the Sargasso Sea, a warm, oligotrophic ocean gyre. RESULTS: The average amino acid identity of SAR11 genes encoded by the metagenomic data to the query genome was only 71%, indicating significant evolutionary divergence between the coastal isolates and Sargasso Sea populations. However, an analysis of gene neighbors indicated that SAR11 genes in the Sargasso Sea metagenomic data match the gene order of the HTCC1062 genome in 96% of cases (> 85,000 observations), and that rearrangements are most frequent at predicted operon boundaries. There were no conserved examples of genes with known functions being found in the coastal isolates, but not the Sargasso Sea metagenomic data, or vice versa, suggesting that core regions of these diverse SAR11 genomes are relatively conserved in gene content. However, four hypervariable regions were observed, which may encode properties associated with variation in SAR11 ecotypes. The largest of these, HVR2, is a 48 kb region flanked by the sole 5S and 23S genes in the HTCC1062 genome, and mainly encodes genes that determine cell surface properties. A comparison of two closely related 'Candidatus Pelagibacter' genomes (HTCC1062 and HTCC1002) revealed a number of "gene indels" in core regions. Most of these were found to be polymorphic in the metagenomic data and showed evidence of purifying selection, suggesting that the same "polymorphic gene indels" are maintained in physically isolated SAR11 populations. CONCLUSION: These findings suggest that natural selection has conserved many core features of SAR11 genomes across broad oceanic scales, but significant variation was found associated with four hypervariable genome regions. The data also led to the hypothesis that some gene insertions and deletions might be polymorphisms, similar to allelic polymorphisms. BioMed Central 2007-11-07 /pmc/articles/PMC2217521/ /pubmed/17988398 http://dx.doi.org/10.1186/1745-6150-2-27 Text en Copyright © 2007 Wilhelm et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Wilhelm, Larry J Tripp, H James Givan, Scott A Smith, Daniel P Giovannoni, Stephen J Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data |
title | Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data |
title_full | Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data |
title_fullStr | Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data |
title_full_unstemmed | Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data |
title_short | Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data |
title_sort | natural variation in sar11 marine bacterioplankton genomes inferred from metagenomic data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2217521/ https://www.ncbi.nlm.nih.gov/pubmed/17988398 http://dx.doi.org/10.1186/1745-6150-2-27 |
work_keys_str_mv | AT wilhelmlarryj naturalvariationinsar11marinebacterioplanktongenomesinferredfrommetagenomicdata AT tripphjames naturalvariationinsar11marinebacterioplanktongenomesinferredfrommetagenomicdata AT givanscotta naturalvariationinsar11marinebacterioplanktongenomesinferredfrommetagenomicdata AT smithdanielp naturalvariationinsar11marinebacterioplanktongenomesinferredfrommetagenomicdata AT giovannonistephenj naturalvariationinsar11marinebacterioplanktongenomesinferredfrommetagenomicdata |