Strategies for Achieving High Sequencing Accuracy for Low Diversity Samples and Avoiding Sample Bleeding Using Illumina Platform

Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations o...

Bibliographic Details
Main Authors: Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393298/
https://www.ncbi.nlm.nih.gov/pubmed/25860802
http://dx.doi.org/10.1371/journal.pone.0120520
author Mitra, Abhishek
Skrzypczak, Magdalena
Ginalski, Krzysztof
Rowicka, Maga
collection PubMed
description Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations cause substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications and in some other fields (e.g. detection of rare variants). Here, we discuss how both problems, the poor quality of low-diversity samples and sample bleeding, are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer’s, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. Alternatively, we discuss how analysis can be repeated from saved sequencing images using the Long Template Protocol to increase accuracy.
format Online
Article
Text
id pubmed-4393298
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4393298 2015-04-21 PLoS One Research Article Public Library of Science 2015-04-10 /pmc/articles/PMC4393298/ /pubmed/25860802 http://dx.doi.org/10.1371/journal.pone.0120520 Text en © 2015 Mitra et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Strategies for Achieving High Sequencing Accuracy for Low Diversity Samples and Avoiding Sample Bleeding Using Illumina Platform
topic Research Article
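
The description in this record notes that low diversity caused by barcodes can be avoided at the barcode design stage. As a rough illustration of that idea, and not the authors' procedure, the sketch below computes the base composition of a barcode set at each sequencing cycle and flags cycles dominated by a single base. The function names, the example barcodes, and the 0.5 threshold are all hypothetical and were chosen only for this example.

```python
from collections import Counter


def per_cycle_base_fractions(barcodes):
    """For each cycle (position), return the fraction of barcodes showing each base."""
    if not barcodes:
        raise ValueError("need at least one barcode")
    length = len(barcodes[0])
    if any(len(b) != length for b in barcodes):
        raise ValueError("all barcodes must have the same length")
    fractions = []
    for cycle in range(length):
        counts = Counter(b[cycle].upper() for b in barcodes)
        fractions.append(
            {base: counts.get(base, 0) / len(barcodes) for base in "ACGT"}
        )
    return fractions


def low_diversity_cycles(barcodes, max_fraction=0.5):
    """Return 0-based cycles where one base exceeds max_fraction.

    max_fraction is an arbitrary illustrative threshold, not a value
    taken from the article or from the manufacturer's guidelines.
    """
    return [
        cycle
        for cycle, fr in enumerate(per_cycle_base_fractions(barcodes))
        if max(fr.values()) > max_fraction
    ]


# Hypothetical barcode sets, invented for illustration only.
balanced = ["ACGT", "CGTA", "GTAC", "TACG"]  # every base appears once per cycle
skewed = ["ACGT", "ACTA", "AGAC", "ATCG"]    # every barcode starts with A

print(low_diversity_cycles(balanced))
print(low_diversity_cycles(skewed))
```

A check of this kind only looks at the barcode set in isolation. In a real library the composition of the first cycles also depends on the relative abundance of each barcoded sample, so the per-barcode fractions would need to be weighted accordingly.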