Differences in sequencing technologies improve the retrieval of anammox bacterial genome from metagenomes

BACKGROUND: Sequencing technologies have different biases in both single-genome and metagenomic sequencing; these can significantly affect ORF recovery and the population distribution of a metagenome. In this paper we investigate how well different technologies represent information related...

Bibliographic Details
Main authors: Gori, Fabio, Tringe, Susannah G, Folino, Gianluigi, van Hijum, Sacha AFT, Op den Camp, Huub JM, Jetten, Mike SM, Marchiori, Elena
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3618311/
https://www.ncbi.nlm.nih.gov/pubmed/23324532
http://dx.doi.org/10.1186/1471-2164-14-7
_version_ 1782265398010314752
author Gori, Fabio
Tringe, Susannah G
Folino, Gianluigi
van Hijum, Sacha AFT
Op den Camp, Huub JM
Jetten, Mike SM
Marchiori, Elena
author_sort Gori, Fabio
collection PubMed
description BACKGROUND: Sequencing technologies have different biases in both single-genome and metagenomic sequencing; these can significantly affect ORF recovery and the population distribution of a metagenome. In this paper we investigate how well different technologies represent information related to an organism of interest in a metagenome, and whether it is beneficial to combine information obtained using different technologies. We comparatively analyze three metagenomic datasets acquired from a sample containing the anammox bacterium Candidatus 'Brocadia fulgida' (B. fulgida). These datasets were obtained using Roche 454 FLX and Sanger sequencing with two different libraries (shotgun and fosmid). RESULTS: In each dataset, the abundance of the reads annotated to B. fulgida was much lower than the abundance expected from available cell count information. This was due to the overrepresentation of GC-richer organisms, as shown by the GC-content distribution of the reads. Nevertheless, when the union of B. fulgida reads over the three datasets was considered, the number of B. fulgida ORFs recovered for at least 80% of their length was twice the number recovered by the best technology. Indeed, while the taxonomic distributions of reads in the three datasets were similar, the respective sets of B. fulgida ORFs recovered for a large part of their length were highly different, and the depth-of-coverage patterns of 454 and Sanger were dissimilar. CONCLUSIONS: Precautions should be taken to prevent the overrepresentation of GC-rich microbes in the datasets. This overrepresentation and the consistency of the taxonomic distributions of reads obtained with different sequencing technologies suggest that, in general, abundance biases might be mainly due to other steps of the sequencing protocols. The results show that biases against organisms of interest could be compensated for by combining different sequencing technologies, owing to the differences in their genome-level sequencing biases, even if the species was present at similar abundances in the metagenomes.
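The ORF-recovery comparison in the abstract (ORFs recovered for at least 80% of their length, per dataset versus over the union of reads) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the half-open interval representation of read alignments, and the toy coordinates below are all assumptions made for illustration, and real data would come from read-to-genome alignments.

```python
from typing import Dict, Iterable, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in genome coordinates


def covered_length(orf: Interval, reads: Iterable[Interval]) -> int:
    """Count ORF positions covered by at least one read alignment."""
    orf_start, orf_end = orf
    # Clip each read to the ORF and discard reads that do not overlap it.
    clipped = sorted(
        (max(s, orf_start), min(e, orf_end))
        for s, e in reads
        if s < orf_end and e > orf_start
    )
    total = 0
    cur_end = orf_start  # rightmost position already counted
    for s, e in clipped:
        if e > cur_end:
            total += e - max(s, cur_end)
            cur_end = e
    return total


def recovered_orfs(
    orfs: Dict[str, Interval],
    reads: Iterable[Interval],
    min_fraction: float = 0.8,  # the 80% threshold mentioned in the abstract
) -> Set[str]:
    """Names of ORFs covered for at least min_fraction of their length."""
    reads = list(reads)
    return {
        name
        for name, (start, end) in orfs.items()
        if covered_length((start, end), reads) >= min_fraction * (end - start)
    }


# Toy, invented coordinates (hypothetical ORFs and alignments).
orfs = {"orfA": (0, 100), "orfB": (200, 300), "orfC": (400, 500)}
datasets = {
    "454": [(0, 50), (200, 290)],
    "sanger_shotgun": [(40, 90)],
    "sanger_fosmid": [(420, 450)],
}

per_dataset = {name: recovered_orfs(orfs, reads) for name, reads in datasets.items()}
union_reads = [r for reads in datasets.values() for r in reads]
union = recovered_orfs(orfs, union_reads)

print(per_dataset)
print(sorted(union))
```

In this toy case no single dataset covers orfA sufficiently, but the pooled reads do, which is the kind of complementarity between technologies that the abstract reports.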
format Online
Article
Text
id pubmed-3618311
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3618311 2013-04-07 BMC Genomics Research Article BioMed Central 2013-01-16 /pmc/articles/PMC3618311/ /pubmed/23324532 http://dx.doi.org/10.1186/1471-2164-14-7 Text en Copyright © 2013 Gori et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Differences in sequencing technologies improve the retrieval of anammox bacterial genome from metagenomes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3618311/
https://www.ncbi.nlm.nih.gov/pubmed/23324532
http://dx.doi.org/10.1186/1471-2164-14-7