LZ Complexity Distance of DNA Sequences and Its Application in Phylogenetic Tree Reconstruction

DNA sequences can be treated as finite-length symbol strings over a four-letter alphabet (A, C, T, G). As a universal and computable complexity measure, LZ complexity is suitable for describing the complexity of DNA sequences. In this study, a concept of conditional LZ complexity between two sequences is...
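The idea in the abstract can be illustrated with a minimal Python sketch. It counts the components of a Lempel-Ziv (1976) exhaustive history, then builds a conditional quantity and a normalized distance from concatenated sequences. The function names, the concatenation-based form of the conditional complexity, and the normalization are assumptions made for illustration only; this record does not reproduce the authors' definitions, so the sketch should not be read as their method.

```python
def lz_complexity(seq: str) -> int:
    """Count the components in the Lempel-Ziv (1976) exhaustive history of seq."""
    n = len(seq)
    i = 0
    count = 0
    while i < n:
        length = 1
        # Grow the candidate component while it can still be copied from
        # text that starts earlier in the sequence (overlap is allowed).
        while i + length <= n and seq[i:i + length] in seq[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def conditional_lz(target: str, given: str) -> int:
    """Assumed conditional complexity: extra components needed for `target`
    when parsing continues after `given` (hypothetical definition)."""
    return lz_complexity(given + target) - lz_complexity(given)


def lz_distance(s: str, q: str) -> float:
    """Assumed normalized, symmetric distance between two non-empty sequences
    (hypothetical normalization, not taken from the paper)."""
    if not s or not q:
        raise ValueError("both sequences must be non-empty")
    numerator = max(conditional_lz(s, q), conditional_lz(q, s))
    return numerator / max(lz_complexity(s), lz_complexity(q))


if __name__ == "__main__":
    # The history of this sequence is A, AC, G, T, ACC, AT, TG.
    print(lz_complexity("AACGTACCATTG"))  # 7
    print(lz_distance("AACGTACCATTG", "ACGGTCACCAA"))
```

A pairwise matrix of such distances over a set of genomes is the kind of input that a distance-based tree-building method would take. The substring search used here is quadratic or worse, so whole mitochondrial genomes would need a faster parser in practice.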

Bibliographic Details
Main authors: Li, Bin; Li, Yi-Bing; He, Hong-Bo
Format: Online Article Text
Language: English
Published: Elsevier 2005
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5172548/
https://www.ncbi.nlm.nih.gov/pubmed/16689687
http://dx.doi.org/10.1016/S1672-0229(05)03028-7
author Li, Bin
Li, Yi-Bing
He, Hong-Bo
collection PubMed
description DNA sequences can be treated as finite-length symbol strings over a four-letter alphabet (A, C, T, G). As a universal and computable complexity measure, LZ complexity is suitable for describing the complexity of DNA sequences. In this study, a concept of conditional LZ complexity between two sequences is proposed according to the principle of the LZ complexity measure. An LZ complexity distance metric between two nonnull sequences is defined by utilizing conditional LZ complexity. Based on LZ complexity distance, a phylogenetic tree of 26 species of placental mammals (Eutheria) with three outgroup species was reconstructed from their complete mitochondrial genomes. On the question of which two of the three main groups of placental mammals (Primates, Ferungulates, and Rodents) are more closely related, the phylogenetic tree reconstructed from LZ complexity distance supports the suggestion that Primates and Ferungulates are more closely related.
format Online
Article
Text
id pubmed-5172548
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-5172548; 2016-12-23; LZ Complexity Distance of DNA Sequences and Its Application in Phylogenetic Tree Reconstruction; Li, Bin; Li, Yi-Bing; He, Hong-Bo; Genomics Proteomics Bioinformatics; Article; Elsevier 2005; 2016-11-28; /pmc/articles/PMC5172548/; /pubmed/16689687; http://dx.doi.org/10.1016/S1672-0229(05)03028-7; Text; en; This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title LZ Complexity Distance of DNA Sequences and Its Application in Phylogenetic Tree Reconstruction
topic Article