Challenges with Using Primer IDs to Improve Accuracy of Next Generation Sequencing

Next generation sequencing technologies, like ultra-deep pyrosequencing (UDPS), allow detailed investigation of complex populations, like RNA viruses, but their utility is limited by errors introduced during sample preparation and sequencing. By tagging each individual cDNA molecule with barcodes, re...

Bibliographic Details
Main authors: Brodin, Johanna; Hedskog, Charlotte; Heddini, Alexander; Benard, Emmanuel; Neher, Richard A.; Mild, Mattias; Albert, Jan
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351057/
https://www.ncbi.nlm.nih.gov/pubmed/25741706
http://dx.doi.org/10.1371/journal.pone.0119123
author Brodin, Johanna
Hedskog, Charlotte
Heddini, Alexander
Benard, Emmanuel
Neher, Richard A.
Mild, Mattias
Albert, Jan
collection PubMed
description Next generation sequencing technologies, like ultra-deep pyrosequencing (UDPS), allow detailed investigation of complex populations, like RNA viruses, but their utility is limited by errors introduced during sample preparation and sequencing. By tagging each individual cDNA molecule with barcodes, referred to as Primer IDs, before PCR and sequencing, these errors could theoretically be removed. Here we evaluated the Primer ID methodology on 257,846 UDPS reads generated from an HIV-1 SG3Δenv plasmid clone and plasma samples from three HIV-infected patients. The Primer ID consisted of 11 randomized nucleotides (4,194,304 combinations) in the primer for cDNA synthesis, which introduced a unique sequence tag into each cDNA molecule. Consensus template sequences were constructed for reads with Primer IDs that were observed three or more times. Despite high numbers of input template molecules, the number of consensus template sequences was low. With 10,000 input molecules for the clone, as few as 97 consensus template sequences were obtained due to a highly skewed frequency of resampling. Furthermore, the number of sequenced templates was overestimated due to PCR errors in the Primer IDs. Finally, some consensus template sequences were erroneous due to hotspots for UDPS errors. The Primer ID methodology has the potential to provide highly accurate deep sequencing. However, it is important to be aware that challenges with the methodology remain. In particular, it is important to find ways to obtain a more even frequency of resampling of template molecules, as well as to identify and remove artefactual consensus template sequences that have been generated by PCR errors in the Primer IDs.
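The two quantitative steps in the description (the size of the Primer ID space, and consensus building for Primer IDs seen three or more times) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the per-position majority vote, the function names, and the assumption that reads sharing a Primer ID are already aligned to equal length are all hypothetical choices made here, since the record does not say how the consensus was computed.

```python
from collections import Counter, defaultdict

PRIMER_ID_LENGTH = 11  # randomized nucleotides, as stated in the description
MIN_READS = 3          # Primer IDs observed three or more times are kept


def primer_id_combinations(length=PRIMER_ID_LENGTH):
    """Number of distinct Primer IDs for a given number of random bases."""
    return 4 ** length


def consensus_by_primer_id(reads, min_reads=MIN_READS):
    """Build one consensus sequence per Primer ID (hypothetical helper).

    reads: iterable of (primer_id, sequence) pairs. Sequences that share a
    Primer ID are ASSUMED to be pre-aligned and of equal length.
    """
    groups = defaultdict(list)
    for primer_id, sequence in reads:
        groups[primer_id].append(sequence)

    consensus = {}
    for primer_id, sequences in groups.items():
        if len(sequences) < min_reads:
            continue  # too few resampled reads to call a consensus
        # Assumed rule: most common base per column. On a tie, Counter
        # returns the base that was encountered first.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus


example_reads = [
    ("AAAAAAAAAAA", "ACGT"),
    ("AAAAAAAAAAA", "ACGT"),
    ("AAAAAAAAAAA", "ACTT"),  # one read carries a sequencing error
    ("CCCCCCCCCCC", "GGGG"),
    ("CCCCCCCCCCC", "GGGG"),  # only two reads, so this Primer ID is dropped
]
example_consensus = consensus_by_primer_id(example_reads)
```

In this toy input the single erroneous read is outvoted, and the Primer ID with only two reads yields no consensus, which mirrors why uneven resampling reduces the number of usable templates. Note that a PCR error inside the Primer ID itself would create a new group rather than being corrected, which is the overestimation problem the description mentions.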
format Online
Article
Text
id pubmed-4351057
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4351057 2015-03-17 Challenges with Using Primer IDs to Improve Accuracy of Next Generation Sequencing Brodin, Johanna Hedskog, Charlotte Heddini, Alexander Benard, Emmanuel Neher, Richard A. Mild, Mattias Albert, Jan PLoS One Research Article
Public Library of Science 2015-03-05 /pmc/articles/PMC4351057/ /pubmed/25741706 http://dx.doi.org/10.1371/journal.pone.0119123 Text en © 2015 Brodin et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Challenges with Using Primer IDs to Improve Accuracy of Next Generation Sequencing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351057/
https://www.ncbi.nlm.nih.gov/pubmed/25741706
http://dx.doi.org/10.1371/journal.pone.0119123