
Bounds on the distribution of the number of gaps when circles and lines are covered by fragments: Theory and practical application to genomic and metagenomic projects

BACKGROUND: The question of how a circle or line segment becomes covered when random arcs are marked off has arisen repeatedly in bioinformatics. The number of uncovered gaps is of particular interest. Approximate distributions for the number of gaps have been given in the literature, one motivation...


Bibliographic Details
Main Authors: Moriarty, John; Marchesi, Julian R; Metcalfe, Anthony
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1821341/
https://www.ncbi.nlm.nih.gov/pubmed/17335566
http://dx.doi.org/10.1186/1471-2105-8-70
author Moriarty, John
Marchesi, Julian R
Metcalfe, Anthony
collection PubMed
description BACKGROUND: The question of how a circle or line segment becomes covered when random arcs are marked off has arisen repeatedly in bioinformatics. The number of uncovered gaps is of particular interest. Approximate distributions for the number of gaps have been given in the literature, one motivation being ease of computation. Error bounds for these approximate distributions have not been given. RESULTS: We give bounds on the probability distribution of the number of gaps when a circle is covered by fragments of fixed size. The absolute error in the approximation is typically on the order of 0.1% at 10× coverage depth. The method can be applied to coverage problems on the interval, including edge effects, and applications are given to metagenomic libraries and shotgun sequencing.
format Text
id pubmed-1821341
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1821341 2007-03-19 BMC Bioinformatics Methodology Article BioMed Central 2007-03-02 /pmc/articles/PMC1821341/ /pubmed/17335566 http://dx.doi.org/10.1186/1471-2105-8-70 Text en Copyright © 2007 Moriarty et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bounds on the distribution of the number of gaps when circles and lines are covered by fragments: Theory and practical application to genomic and metagenomic projects
topic Methodology Article
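
The setting described in the abstract (fixed-size fragments placed at random on a circle, with interest in the number of uncovered gaps) can be explored with a small Monte Carlo sketch. This is not the authors' bounding method, which the record does not describe in detail; it is only a simulation under the assumptions that the circle has circumference 1, every fragment has the same fractional length, and start points are independent and uniform. The names count_gaps, simulate_gap_distribution and expected_gaps are hypothetical. The only closed form used is the standard expected number of gaps, n(1 - a)^(n - 1), for n fragments of fractional length a.

```python
import random
from collections import Counter


def count_gaps(n_fragments, frac_len, rng):
    """Count uncovered gaps for one random placement on a unit circle.

    Assumption: each fragment covers [start, start + frac_len] clockwise,
    so a gap follows a fragment exactly when the next start point
    (clockwise) lies more than frac_len away.
    """
    starts = sorted(rng.random() for _ in range(n_fragments))
    # Spacings between consecutive start points, wrapping around the circle.
    spacings = [b - a for a, b in zip(starts, starts[1:])]
    spacings.append(1.0 - starts[-1] + starts[0])
    return sum(1 for s in spacings if s > frac_len)


def simulate_gap_distribution(n_fragments, frac_len, trials, seed=0):
    """Estimate P(number of gaps = k) by repeated random placement."""
    rng = random.Random(seed)
    counts = Counter(count_gaps(n_fragments, frac_len, rng) for _ in range(trials))
    return {k: c / trials for k, c in sorted(counts.items())}


def expected_gaps(n_fragments, frac_len):
    """Expected number of gaps: n * (1 - a)^(n - 1), valid for 0 <= a <= 1."""
    return n_fragments * (1.0 - frac_len) ** (n_fragments - 1)


if __name__ == "__main__":
    # Illustrative values only: 50 fragments, each 5% of the circle (2.5x depth).
    dist = simulate_gap_distribution(n_fragments=50, frac_len=0.05, trials=5000)
    mean = sum(k * p for k, p in dist.items())
    print("estimated distribution:", dist)
    print("simulated mean:", mean, "expected:", expected_gaps(50, 0.05))
```

At high coverage depth gaps become rare events, so a plain simulation needs many trials to resolve small probabilities; that is one reason analytic approximations with explicit error bounds, as reported in the abstract, are useful.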