
Phylogeny reconstruction based on the length distribution of k-mismatch common substrings

BACKGROUND: Various approaches to alignment-free sequence comparison are based on the length of exact or inexact word matches between pairs of input sequences. Haubold et al. (J Comput Biol 16:1487–1500, 2009) showed how the average number of substitutions per position between two DNA sequences can...


Bibliographic Details
Main authors: Morgenstern, Burkhard, Schöbel, Svenja, Leimeister, Chris-André
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5724348/
https://www.ncbi.nlm.nih.gov/pubmed/29238399
http://dx.doi.org/10.1186/s13015-017-0118-8
_version_ 1783285345462255616
author Morgenstern, Burkhard
Schöbel, Svenja
Leimeister, Chris-André
author_facet Morgenstern, Burkhard
Schöbel, Svenja
Leimeister, Chris-André
author_sort Morgenstern, Burkhard
collection PubMed
description BACKGROUND: Various approaches to alignment-free sequence comparison are based on the length of exact or inexact word matches between pairs of input sequences. Haubold et al. (J Comput Biol 16:1487–1500, 2009) showed how the average number of substitutions per position between two DNA sequences can be estimated based on the average length of exact common substrings. RESULTS: In this paper, we study the length distribution of k-mismatch common substrings between two sequences. We show that the number of substitutions per position can be accurately estimated from the position of a local maximum in the length distribution of their k-mismatch common substrings.
format Online
Article
Text
id pubmed-5724348
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5724348 2017-12-13 Phylogeny reconstruction based on the length distribution of k-mismatch common substrings Morgenstern, Burkhard Schöbel, Svenja Leimeister, Chris-André Algorithms Mol Biol Research BACKGROUND: Various approaches to alignment-free sequence comparison are based on the length of exact or inexact word matches between pairs of input sequences. Haubold et al. (J Comput Biol 16:1487–1500, 2009) showed how the average number of substitutions per position between two DNA sequences can be estimated based on the average length of exact common substrings. RESULTS: In this paper, we study the length distribution of k-mismatch common substrings between two sequences. We show that the number of substitutions per position can be accurately estimated from the position of a local maximum in the length distribution of their k-mismatch common substrings. BioMed Central 2017-12-11 /pmc/articles/PMC5724348/ /pubmed/29238399 http://dx.doi.org/10.1186/s13015-017-0118-8 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Morgenstern, Burkhard
Schöbel, Svenja
Leimeister, Chris-André
Phylogeny reconstruction based on the length distribution of k-mismatch common substrings
title Phylogeny reconstruction based on the length distribution of k-mismatch common substrings
title_full Phylogeny reconstruction based on the length distribution of k-mismatch common substrings
title_fullStr Phylogeny reconstruction based on the length distribution of k-mismatch common substrings
title_full_unstemmed Phylogeny reconstruction based on the length distribution of k-mismatch common substrings
title_short Phylogeny reconstruction based on the length distribution of k-mismatch common substrings
title_sort phylogeny reconstruction based on the length distribution of k-mismatch common substrings
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5724348/
https://www.ncbi.nlm.nih.gov/pubmed/29238399
http://dx.doi.org/10.1186/s13015-017-0118-8
work_keys_str_mv AT morgensternburkhard phylogenyreconstructionbasedonthelengthdistributionofkmismatchcommonsubstrings
AT schobelsvenja phylogenyreconstructionbasedonthelengthdistributionofkmismatchcommonsubstrings
AT leimeisterchrisandre phylogenyreconstructionbasedonthelengthdistributionofkmismatchcommonsubstrings