Large Scale Loss of Data in Low-Diversity Illumina Sequencing Libraries Can Be Recovered by Deferred Cluster Calling

Massively parallel DNA sequencing is capable of sequencing tens of millions of DNA fragments at the same time. However, sequence bias in the initial cycles, which are used to determine the coordinates of individual clusters, causes a loss of fidelity in cluster identification on Illumina Genome Anal...

Bibliographic Details
Main authors: Krueger, Felix; Andrews, Simon R.; Osborne, Cameron S.
Format: Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3030592/
https://www.ncbi.nlm.nih.gov/pubmed/21305042
http://dx.doi.org/10.1371/journal.pone.0016607
_version_ 1782197282054078464
author Krueger, Felix
Andrews, Simon R.
Osborne, Cameron S.
author_facet Krueger, Felix
Andrews, Simon R.
Osborne, Cameron S.
author_sort Krueger, Felix
collection PubMed
description Massively parallel DNA sequencing is capable of sequencing tens of millions of DNA fragments at the same time. However, sequence bias in the initial cycles, which are used to determine the coordinates of individual clusters, causes a loss of fidelity in cluster identification on Illumina Genome Analysers. This can result in a significant reduction in the numbers of clusters that can be analysed. Such low sample diversity is an intrinsic problem of sequencing libraries that are generated by restriction enzyme digestion, such as e4C-seq or reduced-representation libraries. Similarly, this problem can also arise through the combined sequencing of barcoded, multiplexed libraries. We describe a procedure to defer the mapping of cluster coordinates until low-diversity sequences have been passed. This simple procedure can recover substantial amounts of next generation sequencing data that would otherwise be lost.
format Text
id pubmed-3030592
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-30305922011-02-08 Large Scale Loss of Data in Low-Diversity Illumina Sequencing Libraries Can Be Recovered by Deferred Cluster Calling Krueger, Felix Andrews, Simon R. Osborne, Cameron S. PLoS One Research Article Massively parallel DNA sequencing is capable of sequencing tens of millions of DNA fragments at the same time. However, sequence bias in the initial cycles, which are used to determine the coordinates of individual clusters, causes a loss of fidelity in cluster identification on Illumina Genome Analysers. This can result in a significant reduction in the numbers of clusters that can be analysed. Such low sample diversity is an intrinsic problem of sequencing libraries that are generated by restriction enzyme digestion, such as e4C-seq or reduced-representation libraries. Similarly, this problem can also arise through the combined sequencing of barcoded, multiplexed libraries. We describe a procedure to defer the mapping of cluster coordinates until low-diversity sequences have been passed. This simple procedure can recover substantial amounts of next generation sequencing data that would otherwise be lost. Public Library of Science 2011-01-28 /pmc/articles/PMC3030592/ /pubmed/21305042 http://dx.doi.org/10.1371/journal.pone.0016607 Text en Krueger et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Krueger, Felix
Andrews, Simon R.
Osborne, Cameron S.
Large Scale Loss of Data in Low-Diversity Illumina Sequencing Libraries Can Be Recovered by Deferred Cluster Calling
title Large Scale Loss of Data in Low-Diversity Illumina Sequencing Libraries Can Be Recovered by Deferred Cluster Calling
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3030592/
https://www.ncbi.nlm.nih.gov/pubmed/21305042
http://dx.doi.org/10.1371/journal.pone.0016607