SASI-Seq: sample assurance Spike-Ins, and highly differentiating 384 barcoding for Illumina sequencing

BACKGROUND: A minor but significant fraction of samples subjected to next-generation sequencing methods are either mixed-up or cross-contaminated. These events can lead to false or inconclusive results. We have therefore developed SASI-Seq; a process whereby a set of uniquely barcoded DNA fragments...

Bibliographic Details
Main authors: Quail, Michael A, Smith, Miriam, Jackson, David, Leonard, Steven, Skelly, Thomas, Swerdlow, Harold P, Gu, Yong, Ellis, Peter
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4008303/
https://www.ncbi.nlm.nih.gov/pubmed/24507442
http://dx.doi.org/10.1186/1471-2164-15-110
_version_ 1782314422976380928
author Quail, Michael A
Smith, Miriam
Jackson, David
Leonard, Steven
Skelly, Thomas
Swerdlow, Harold P
Gu, Yong
Ellis, Peter
author_facet Quail, Michael A
Smith, Miriam
Jackson, David
Leonard, Steven
Skelly, Thomas
Swerdlow, Harold P
Gu, Yong
Ellis, Peter
author_sort Quail, Michael A
collection PubMed
description BACKGROUND: A minor but significant fraction of samples subjected to next-generation sequencing methods are either mixed-up or cross-contaminated. These events can lead to false or inconclusive results. We have therefore developed SASI-Seq; a process whereby a set of uniquely barcoded DNA fragments are added to samples destined for sequencing. From the final sequencing data, one can verify that all the reads derive from the original sample(s) and not from contaminants or other samples. RESULTS: By adding a mixture of three uniquely barcoded amplicons, of different sizes spanning the range of insert sizes one would normally use for Illumina sequencing, at a spike-in level of approximately 0.1%, we demonstrate that these fragments remain intimately associated with the sample. They can be detected following even the tightest size selection regimes or exome enrichment and can report the occurrence of sample mix-ups and cross-contamination. As a consequence of this work, we have designed a set of 384 eleven-base Illumina barcode sequences that are at least 5 changes apart from each other, allowing for single-error correction and very low levels of barcode misallocation due to sequencing error. CONCLUSION: SASI-Seq is a simple, inexpensive and flexible tool that enables sample assurance, allows deconvolution of sample mix-ups and reports levels of cross-contamination between samples throughout NGS workflows. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-110) contains supplementary material, which is available to authorized users.
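The abstract's barcode claim (384 eleven-base barcodes, each at least 5 changes from every other, allowing single-error correction) can be illustrated with a minimal sketch. It assumes that "changes" means substitutions, so that the Hamming distance applies. The four toy barcodes and the function names are hypothetical and are not taken from the published set or from the authors' software.

```python
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))

def assign_barcode(read, barcodes, max_mismatches=1):
    """Return the single barcode within max_mismatches of the read, else None."""
    hits = [b for b in barcodes if hamming(read, b) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None

# Hypothetical 11-base barcodes for illustration only (NOT the published set).
TOY_BARCODES = ["AAAAAAAAAAA", "CCCCCAAAAAA", "AAAAAACCCCC", "GGGGGGGGGGG"]

if __name__ == "__main__":
    print(min_pairwise_distance(TOY_BARCODES))
    print(assign_barcode("AAAAAAAAAAT", TOY_BARCODES))
```

When the minimum pairwise distance is 5, a read carrying one substitution error lies 1 change from its true barcode and at least 4 changes from every other barcode, so the assignment with `max_mismatches=1` is unambiguous.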
format Online
Article
Text
id pubmed-4008303
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4008303 2014-05-03 SASI-Seq: sample assurance Spike-Ins, and highly differentiating 384 barcoding for Illumina sequencing Quail, Michael A Smith, Miriam Jackson, David Leonard, Steven Skelly, Thomas Swerdlow, Harold P Gu, Yong Ellis, Peter BMC Genomics Methodology Article
BioMed Central 2014-02-07 /pmc/articles/PMC4008303/ /pubmed/24507442 http://dx.doi.org/10.1186/1471-2164-15-110 Text en © Quail et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Quail, Michael A
Smith, Miriam
Jackson, David
Leonard, Steven
Skelly, Thomas
Swerdlow, Harold P
Gu, Yong
Ellis, Peter
SASI-Seq: sample assurance Spike-Ins, and highly differentiating 384 barcoding for Illumina sequencing
title SASI-Seq: sample assurance Spike-Ins, and highly differentiating 384 barcoding for Illumina sequencing
title_full SASI-Seq: sample assurance Spike-Ins, and highly differentiating 384 barcoding for Illumina sequencing
title_fullStr SASI-Seq: sample assurance Spike-Ins, and highly differentiating 384 barcoding for Illumina sequencing
title_full_unstemmed SASI-Seq: sample assurance Spike-Ins, and highly differentiating 384 barcoding for Illumina sequencing
title_short SASI-Seq: sample assurance Spike-Ins, and highly differentiating 384 barcoding for Illumina sequencing
title_sort sasi-seq: sample assurance spike-ins, and highly differentiating 384 barcoding for illumina sequencing
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4008303/
https://www.ncbi.nlm.nih.gov/pubmed/24507442
http://dx.doi.org/10.1186/1471-2164-15-110