Improved metagenome assemblies and taxonomic binning using long-read circular consensus sequence data

DNA assembly is a core methodological step in metagenomic pipelines used to study the structure and function within microbial communities. Here we investigate the utility of Pacific Biosciences long and high accuracy circular consensus sequencing (CCS) reads for metagenomic projects. We compared the...

Bibliographic Details
Main Authors: Frank, J. A., Pan, Y., Tooming-Klunderud, A., Eijsink, V. G. H., McHardy, A. C., Nederbragt, A. J., Pope, P. B.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4860591/
https://www.ncbi.nlm.nih.gov/pubmed/27156482
http://dx.doi.org/10.1038/srep25373
author Frank, J. A.
Pan, Y.
Tooming-Klunderud, A.
Eijsink, V. G. H.
McHardy, A. C.
Nederbragt, A. J.
Pope, P. B.
author_sort Frank, J. A.
collection PubMed
description DNA assembly is a core methodological step in metagenomic pipelines used to study the structure and function within microbial communities. Here we investigate the utility of Pacific Biosciences long and high accuracy circular consensus sequencing (CCS) reads for metagenomic projects. We compared the application and performance of both PacBio CCS and Illumina HiSeq data with assembly and taxonomic binning algorithms using metagenomic samples representing a complex microbial community. Eight SMRT cells produced approximately 94 Mb of CCS reads from a biogas reactor microbiome sample that averaged 1319 nt in length and 99.7% accuracy. CCS data assembly generated a comparable number of large contigs (greater than 1 kb) to those assembled from a ~190x larger HiSeq dataset (~18 Gb) produced from the same sample (i.e., approximately 62% of total contigs). Hybrid assemblies using PacBio CCS and HiSeq contigs produced improvements in assembly statistics, including an increase in the average contig length and number of large contigs. The incorporation of CCS data produced significant enhancements in taxonomic binning and genome reconstruction of two dominant phylotypes, which assembled and binned poorly using HiSeq data alone. Collectively these results illustrate the value of PacBio CCS reads in certain metagenomics applications.
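The dataset sizes quoted in the description can be checked with a little arithmetic. The sketch below is illustrative only and is not part of the record or the authors' analysis: it assumes 1 Mb = 10^6 bases and 1 Gb = 10^9 bases, and the function names are hypothetical. The read count it prints is a rough estimate derived from the quoted total and mean read length, not a figure reported in the record.

```python
# Back-of-the-envelope check of the dataset sizes quoted in the description.
# Assumption (not stated in the record): 1 Mb = 1e6 bases, 1 Gb = 1e9 bases.

CCS_BASES = 94e6          # "approximately 94 Mb of CCS reads"
HISEQ_BASES = 18e9        # "~18 Gb" HiSeq dataset
MEAN_CCS_READ_LEN = 1319  # "averaged 1319 nt in length"


def size_ratio(larger_bases, smaller_bases):
    """Return how many times larger one dataset is than another."""
    return larger_bases / smaller_bases


def approx_read_count(total_bases, mean_read_len):
    """Estimate the number of reads from total bases and mean read length."""
    return total_bases / mean_read_len


if __name__ == "__main__":
    ratio = size_ratio(HISEQ_BASES, CCS_BASES)
    reads = approx_read_count(CCS_BASES, MEAN_CCS_READ_LEN)
    print(f"HiSeq / CCS size ratio: about {ratio:.0f}x")
    print(f"Estimated CCS read count: about {reads:,.0f}")
```

Under these assumptions the ratio comes out close to the "~190x" quoted in the description.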
format Online
Article
Text
id pubmed-4860591
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-4860591 2016-05-20 Sci Rep Article Nature Publishing Group 2016-05-09 /pmc/articles/PMC4860591/ /pubmed/27156482 http://dx.doi.org/10.1038/srep25373 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Improved metagenome assemblies and taxonomic binning using long-read circular consensus sequence data
topic Article