
Control for stochastic sampling variation and qualitative sequencing error in next generation sequencing


Bibliographic Details
Main authors: Blomquist, Thomas, Crawford, Erin L., Yeo, Jiyoun, Zhang, Xiaolu, Willey, James C.
Format: Online Article Text
Language: English
Published: Elsevier 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4673681/
https://www.ncbi.nlm.nih.gov/pubmed/26693143
http://dx.doi.org/10.1016/j.bdq.2015.08.003
_version_ 1782404785787371520
author Blomquist, Thomas
Crawford, Erin L.
Yeo, Jiyoun
Zhang, Xiaolu
Willey, James C.
author_sort Blomquist, Thomas
collection PubMed
description BACKGROUND: Clinical implementation of Next-Generation Sequencing (NGS) is challenged by poor control for stochastic sampling, library preparation biases and qualitative sequencing error. To address these challenges we developed and tested two hypotheses. METHODS: Hypothesis 1: Analytical variation in quantification is predicted by stochastic sampling effects at input of (a) amplifiable nucleic acid target molecules into the library preparation, (b) amplicons from library into sequencer, or (c) both. We derived equations using Monte Carlo simulation to predict assay coefficient of variation (CV) based on these three working models and tested them against NGS data from specimens with well-characterized molecule inputs and sequence counts, prepared using a competitive multiplex-PCR amplicon-based NGS library preparation method that includes synthetic internal standards (IS). Hypothesis 2: Frequencies of technically derived qualitative sequencing errors (i.e., base substitution, insertion and deletion) observed at each base position in each target native template (NT) are concordant with those observed in the respective competitive synthetic IS present in the same reaction. We measured error frequencies at each base position within amplicons from each of 30 target NTs, then tested whether they correspond to those within the 30 respective IS. RESULTS: For hypothesis 1, the Monte Carlo model derived from both sampling events best predicted CV and explained 74% of observed assay variance. For hypothesis 2, the observed frequency and type of sequence variation at each base position within each IS were concordant with those observed in the respective NTs (R² = 0.93). CONCLUSION: In targeted NGS, synthetic competitive IS control for stochastic sampling at input of both target into library preparation and of target library product into sequencer, and control for qualitative errors generated during library preparation and sequencing. These controls enable accurate clinical diagnostic reporting of confidence limits and limit of detection for copy number measurement, and of frequency for each actionable mutation.
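The two-stage sampling idea in Hypothesis 1 (c) can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' model or equations: it assumes each sampling event is Poisson, ignores amplification bias and the ratio to internal standards, and uses hypothetical names (`poisson`, `simulate_cv`, `analytic_cv`) and illustrative input values that do not come from the paper.

```python
import math
import random
import statistics


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (suitable for lam up to a few hundred)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_cv(mean_molecules, mean_reads, trials=20000, seed=0):
    """Monte Carlo estimate of the CV of read counts after two sampling events.

    Assumption: both events are Poisson. Stage 1 samples target molecules
    into library preparation; stage 2 samples amplicons into the sequencer,
    with expected reads proportional to the molecules actually sampled.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        molecules = poisson(mean_molecules, rng)              # stage 1
        expected_reads = mean_reads * molecules / mean_molecules
        counts.append(poisson(expected_reads, rng))           # stage 2
    return statistics.pstdev(counts) / statistics.mean(counts)


def analytic_cv(mean_molecules, mean_reads):
    """Closed-form CV for the same two-stage Poisson assumption."""
    return math.sqrt(1 / mean_molecules + 1 / mean_reads)


if __name__ == "__main__":
    # Hypothetical inputs: 20 target molecules on average, 50 reads on average.
    print("simulated CV:", round(simulate_cv(20, 50), 3))
    print("analytic CV: ", round(analytic_cv(20, 50), 3))
```

Under these assumptions the simulated CV should track the closed form sqrt(1/molecules + 1/reads), which shows why low molecule input limits precision regardless of sequencing depth.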
format Online
Article
Text
id pubmed-4673681
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-4673681 2016-04-13 Control for stochastic sampling variation and qualitative sequencing error in next generation sequencing Blomquist, Thomas Crawford, Erin L. Yeo, Jiyoun Zhang, Xiaolu Willey, James C. Biomol Detect Quantif Research Paper Elsevier 2015-08-28 /pmc/articles/PMC4673681/ /pubmed/26693143 http://dx.doi.org/10.1016/j.bdq.2015.08.003 Text en http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Control for stochastic sampling variation and qualitative sequencing error in next generation sequencing
topic Research Paper