VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature

Microbes play fundamental roles in shaping natural ecosystem properties and functions, but do so under constraints imposed by their viral predators. However, studying viruses in nature can be challenging due to low biomass and the lack of universal gene markers. Though metagenomic short-read sequenc...

Bibliographic Details
Main Authors: Zablocki, Olivier; Michelsen, Michelle; Burris, Marie; Solonenko, Natalie; Warwick-Dugdale, Joanna; Ghosh, Romik; Pett-Ridge, Jennifer; Sullivan, Matthew B.; Temperton, Ben
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8018248/
https://www.ncbi.nlm.nih.gov/pubmed/33850654
http://dx.doi.org/10.7717/peerj.11088
author Zablocki, Olivier
Michelsen, Michelle
Burris, Marie
Solonenko, Natalie
Warwick-Dugdale, Joanna
Ghosh, Romik
Pett-Ridge, Jennifer
Sullivan, Matthew B.
Temperton, Ben
author_sort Zablocki, Olivier
collection PubMed
description Microbes play fundamental roles in shaping natural ecosystem properties and functions, but do so under constraints imposed by their viral predators. However, studying viruses in nature can be challenging due to low biomass and the lack of universal gene markers. Though metagenomic short-read sequencing has greatly improved our virus ecology toolkit—and revealed many critical ecosystem roles for viruses—microdiverse populations and fine-scale genomic traits are missed. Some of these microdiverse populations are abundant and the missed regions may be of interest for identifying selection pressures that underpin evolutionary constraints associated with hosts and environments. Though long-read sequencing promises complete virus genomes on single reads, it currently suffers from high DNA requirements and sequencing errors that limit accurate gene prediction. Here we introduce VirION2, an integrated short- and long-read metagenomic wet-lab and informatics pipeline that updates our previous method (VirION) to further enhance the utility of long-read viral metagenomics. Using a viral mock community, we first optimized laboratory protocols (polymerase choice, DNA shearing size, PCR cycling) to enable 76% longer reads (now median length of 6,965 bp) from 100-fold less input DNA (now 1 nanogram). Using a virome from a natural seawater sample, we compared viromes generated with VirION2 against other library preparation options (unamplified, original VirION, and short-read), and optimized downstream informatics for improved long-read error correction and assembly. VirION2 assemblies combined with short-read based data (‘enhanced’ viromes), provided significant improvements over VirION libraries in the recovery of longer and more complete viral genomes, and our optimized error-correction strategy using long- and short-read data achieved 99.97% accuracy. 
In the seawater virome, VirION2 assemblies captured 5,161 viral populations (including all of the virus populations observed in the other assemblies), 30% of which were uniquely assembled through inclusion of long-reads, and 22% of the top 10% most abundant virus populations derived from assembly of long-reads. Viral populations unique to VirION2 assemblies had significantly higher microdiversity means, which may explain why short-read virome approaches failed to capture them. These findings suggest the VirION2 sample prep and workflow can help researchers better investigate the virosphere, even from challenging low-biomass samples. Our new protocols are available to the research community on protocols.io as a ‘living document’ to facilitate dissemination of updates to keep pace with the rapid evolution of long-read sequencing technology.
format Online
Article
Text
id pubmed-8018248
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8018248 2021-04-12 VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature PeerJ Bioinformatics PeerJ Inc. 2021-03-30 /pmc/articles/PMC8018248/ /pubmed/33850654 http://dx.doi.org/10.7717/peerj.11088 Text en ©2021 Zablocki et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8018248/
https://www.ncbi.nlm.nih.gov/pubmed/33850654
http://dx.doi.org/10.7717/peerj.11088