Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data

BACKGROUND: One objective of metagenomics is to reconstruct information about specific uncultured organisms from fragmentary environmental DNA sequences. We used the genome of an isolate of the marine alphaproteobacterium SAR11 ('Candidatus Pelagibacter ubique'; strain HTCC1062), obtained...

Bibliographic Details
Main Authors: Wilhelm, Larry J, Tripp, H James, Givan, Scott A, Smith, Daniel P, Giovannoni, Stephen J
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2217521/
https://www.ncbi.nlm.nih.gov/pubmed/17988398
http://dx.doi.org/10.1186/1745-6150-2-27
_version_ 1782149271905107968
author Wilhelm, Larry J
Tripp, H James
Givan, Scott A
Smith, Daniel P
Giovannoni, Stephen J
collection PubMed
description BACKGROUND: One objective of metagenomics is to reconstruct information about specific uncultured organisms from fragmentary environmental DNA sequences. We used the genome of an isolate of the marine alphaproteobacterium SAR11 ('Candidatus Pelagibacter ubique'; strain HTCC1062), obtained from the cold, productive Oregon coast, as a query sequence to study variation in SAR11 metagenome sequence data from the Sargasso Sea, a warm, oligotrophic ocean gyre. RESULTS: The average amino acid identity of SAR11 genes encoded by the metagenomic data to the query genome was only 71%, indicating significant evolutionary divergence between the coastal isolates and Sargasso Sea populations. However, an analysis of gene neighbors indicated that SAR11 genes in the Sargasso Sea metagenomic data match the gene order of the HTCC1062 genome in 96% of cases (> 85,000 observations), and that rearrangements are most frequent at predicted operon boundaries. There were no conserved examples of genes with known functions being found in the coastal isolates, but not the Sargasso Sea metagenomic data, or vice versa, suggesting that core regions of these diverse SAR11 genomes are relatively conserved in gene content. However, four hypervariable regions were observed, which may encode properties associated with variation in SAR11 ecotypes. The largest of these, HVR2, is a 48 kb region flanked by the sole 5S and 23S genes in the HTCC1062 genome, and mainly encodes genes that determine cell surface properties. A comparison of two closely related 'Candidatus Pelagibacter' genomes (HTCC1062 and HTCC1002) revealed a number of "gene indels" in core regions. Most of these were found to be polymorphic in the metagenomic data and showed evidence of purifying selection, suggesting that the same "polymorphic gene indels" are maintained in physically isolated SAR11 populations. 
CONCLUSION: These findings suggest that natural selection has conserved many core features of SAR11 genomes across broad oceanic scales, but significant variation was found associated with four hypervariable genome regions. The data also led to the hypothesis that some gene insertions and deletions might be polymorphisms, similar to allelic polymorphisms.
format Text
id pubmed-2217521
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2217521 2008-01-30 Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data Wilhelm, Larry J Tripp, H James Givan, Scott A Smith, Daniel P Giovannoni, Stephen J Biol Direct Research BioMed Central 2007-11-07 /pmc/articles/PMC2217521/ /pubmed/17988398 http://dx.doi.org/10.1186/1745-6150-2-27 Text en Copyright © 2007 Wilhelm et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data
topic Research