Extension of Lander-Waterman theory for sequencing filtered DNA libraries

BACKGROUND: The degree to which conventional DNA sequencing techniques will be successful for highly repetitive genomes is unclear. Investigators are therefore considering various filtering methods to select against high-copy sequence in DNA clone libraries. The standard model for random sequencing,...
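The standard Lander-Waterman model named in the abstract can be sketched in a few lines. This is a minimal illustration of the textbook formulas only, not the extension reported by Wendl and Barbazuk: it assumes uniformly random read placement on a single contiguous target and ignores the minimum overlap needed to detect a join between reads. The function names and the example numbers are hypothetical and do not come from the paper.

```python
import math


def redundancy(num_reads, read_length, target_length):
    """Sequence redundancy (fold coverage): total bases read / target size."""
    return num_reads * read_length / target_length


def expected_coverage(rho):
    """Expected fraction of the target covered by at least one read.

    Textbook Lander-Waterman result under uniform random placement.
    """
    return 1.0 - math.exp(-rho)


def expected_gaps(num_reads, rho):
    """Expected number of gaps, with the overlap-detection threshold
    set to zero (an assumption made here to keep the sketch simple)."""
    return num_reads * math.exp(-rho)


if __name__ == "__main__":
    # Hypothetical project: 1 Mb target, 500 bp reads, 1000 reads.
    rho = redundancy(num_reads=1000, read_length=500, target_length=1_000_000)
    print(f"redundancy: {rho:.2f}")
    print(f"expected coverage fraction: {expected_coverage(rho):.3f}")
    print(f"expected number of gaps: {expected_gaps(1000, rho):.1f}")
```

Under these textbook formulas, a redundancy of 1 corresponds to an expected coverage of 1 - e^{-1}, roughly 63% of the target. According to the abstract, filtered libraries fall below such standard predictions once redundancy exceeds about 1-fold, because of library gaps and the edge effect.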

Bibliographic Details
Main authors: Wendl, Michael C; Barbazuk, W Brad
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1280921/
https://www.ncbi.nlm.nih.gov/pubmed/16216129
http://dx.doi.org/10.1186/1471-2105-6-245
_version_ 1782126113332396032
author Wendl, Michael C
Barbazuk, W Brad
collection PubMed
description BACKGROUND: The degree to which conventional DNA sequencing techniques will be successful for highly repetitive genomes is unclear. Investigators are therefore considering various filtering methods to select against high-copy sequence in DNA clone libraries. The standard model for random sequencing, Lander-Waterman theory, does not account for two important issues in such libraries, discontinuities and position-based sampling biases (the so-called "edge effect"). We report an extension of the theory for analyzing such configurations. RESULTS: The edge effect cannot be neglected in most cases. Specifically, rates of coverage and gap reduction are appreciably lower than those for conventional libraries, as predicted by standard theory. Performance decreases as read length increases relative to island size. Although opposite of what happens in a conventional library, this apparent paradox is readily explained in terms of the edge effect. The model agrees well with prototype gene-tagging experiments for Zea mays and Sorghum bicolor. Moreover, the associated density function suggests well-defined probabilistic milestones for the number of reads necessary to capture a given fraction of the gene space. An exception for applying standard theory arises if sequence redundancy is less than about 1-fold. Here, evolution of the random quantities is independent of library gaps and edge effects. This observation effectively validates the practice of using standard theory to estimate the genic enrichment of a library based on light shotgun sequencing. CONCLUSION: Coverage performance using a filtered library is significantly lower than that for an equivalent-sized conventional library, suggesting that directed methods may be more critical for the former. The proposed model should be useful for analyzing future projects.
format Text
id pubmed-1280921
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1280921 2005-11-10 Extension of Lander-Waterman theory for sequencing filtered DNA libraries Wendl, Michael C Barbazuk, W Brad BMC Bioinformatics Research Article BioMed Central 2005-10-10 /pmc/articles/PMC1280921/ /pubmed/16216129 http://dx.doi.org/10.1186/1471-2105-6-245 Text en Copyright © 2005 Wendl and Barbazuk; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Extension of Lander-Waterman theory for sequencing filtered DNA libraries
topic Research Article