Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies

Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing, which can require extensive optimizations to overcome. We prese...

Bibliographic Details
Main authors: Westfall, Dylan H.; Deng, Wenjie; Pankow, Alec; Murrell, Hugh; Chen, Lennie; Zhao, Hong; Williamson, Carolyn; Rolland, Morgane; Murrell, Ben; Mullins, James I.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9980183/
https://www.ncbi.nlm.nih.gov/pubmed/36865215
http://dx.doi.org/10.1101/2023.02.23.529831
author Westfall, Dylan H.
Deng, Wenjie
Pankow, Alec
Murrell, Hugh
Chen, Lennie
Zhao, Hong
Williamson, Carolyn
Rolland, Morgane
Murrell, Ben
Mullins, James I.
collection PubMed
description Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing, which can require extensive optimizations to overcome. We present complete laboratory and bioinformatics workflows to overcome many of these hurdles. The Pacific Biosciences single molecule real-time platform was used to sequence PCR amplicons derived from cDNA templates tagged with universal molecular identifiers (SMRT-UMI). Optimized laboratory protocols were developed through extensive testing of different sample preparation conditions to minimize between-template recombination during PCR, and the use of UMIs allowed accurate template quantitation as well as removal of point mutations introduced during PCR and sequencing to produce a highly accurate consensus sequence from each template. Handling of the large datasets produced from SMRT-UMI sequencing was facilitated by a novel bioinformatic pipeline, Probabilistic Offspring Resolver for Primer IDs (PORPIDpipeline), that automatically filters and parses reads by sample, identifies and discards reads with UMIs likely created from PCR and sequencing errors, generates consensus sequences, checks for contamination within the dataset, and removes any sequence with evidence of PCR recombination or early cycle PCR errors, resulting in highly accurate sequence datasets. The optimized SMRT-UMI sequencing method presented here represents a highly adaptable and established starting point for accurate sequencing of diverse pathogens. These methods are illustrated through characterization of human immunodeficiency virus (HIV) quasispecies.
format Online
Article
Text
id pubmed-9980183
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-9980183 2023-03-03 bioRxiv Article Cold Spring Harbor Laboratory 2023-02-24 /pmc/articles/PMC9980183/ /pubmed/36865215 http://dx.doi.org/10.1101/2023.02.23.529831 Text en https://creativecommons.org/publicdomain/zero/1.0/ This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license (https://creativecommons.org/publicdomain/zero/1.0/).
title Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations – application to HIV-1 quasispecies
topic Article