
Near-medians that avoid the corners; a combinatorial probability approach

BACKGROUND: The breakpoint median for a set of k ≥ 3 random genomes tends to approach (any) one of these genomes ("corners") as genome length increases, although there is a diminishing proportion of medians equidistant from all k ("medians in the middle"). Algorithms are likely to...


Bibliographic Details
Main authors: Larlee, Caroline Anne; Zheng, Chunfang; Sankoff, David
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4239572/
https://www.ncbi.nlm.nih.gov/pubmed/25572274
http://dx.doi.org/10.1186/1471-2164-15-S6-S1
author Larlee, Caroline Anne
Zheng, Chunfang
Sankoff, David
collection PubMed
description BACKGROUND: The breakpoint median for a set of k ≥ 3 random genomes tends to approach (any) one of these genomes ("corners") as genome length increases, although there is a diminishing proportion of medians equidistant from all k ("medians in the middle"). Algorithms are likely to miss the latter, and this has consequences for the general case where input genomes share some or many gene adjacencies, where the tendency for the median to be closer to one input genome may be an artifact of the corner tendency. RESULTS: We present a simple sampling procedure for constructing a "near median" that represents a compromise among k random genomes and that has only a slightly greater breakpoint distance to all of them than the median does. We generalize to the realistic case where genomes share varying proportions of gene adjacencies. We present a supplementary sampling scheme that brings the constructed genome even closer to median status. CONCLUSIONS: Our approach is of particular use in the phylogenetic context where medians are repeatedly calculated at ancestral nodes, and where the corner effect prevents different parts of the phylogeny from communicating with each other.
format Online
Article
Text
id pubmed-4239572
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4239572 2014-11-25 BMC Genomics Research BioMed Central 2014-10-17 /pmc/articles/PMC4239572/ /pubmed/25572274 http://dx.doi.org/10.1186/1471-2164-15-S6-S1 Text en Copyright © 2014 Larlee et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Near-medians that avoid the corners; a combinatorial probability approach
topic Research