WSE, a new sequence distance measure based on word frequencies
In this article, we present a new distance metric, the Weighted Sequence Entropy (WSE), based on the short-word composition of biological sequences. As a revision of the classical relative entropy (RE), our metric (1) behaves equivalently to RE for small k and (2) avoids the degeneracy when...
Main authors: | Wang, Jun; Zheng, Xiaoqi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Inc. 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7185439/ https://www.ncbi.nlm.nih.gov/pubmed/18590747 http://dx.doi.org/10.1016/j.mbs.2008.06.001 |
_version_ | 1783526757812404224 |
---|---|
author | Wang, Jun Zheng, Xiaoqi |
author_facet | Wang, Jun Zheng, Xiaoqi |
author_sort | Wang, Jun |
collection | PubMed |
description | In this article, we present a new distance metric, the Weighted Sequence Entropy (WSE), based on the short-word composition of biological sequences. As a revision of the classical relative entropy (RE), our metric (1) behaves equivalently to RE for small k and (2) avoids the degeneracy that arises when some word types are absent from one sequence but present in the other. Experiments on 25 viruses, including SARS-CoVs, show that our method and RE give exactly the same phylogenetic tree when word length [Formula: see text]. When [Formula: see text], our method still works and yields a convergent phylogenetic topology, whereas RE gives degenerate results. |
format | Online Article Text |
id | pubmed-7185439 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | Elsevier Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-71854392020-04-28 WSE, a new sequence distance measure based on word frequencies Wang, Jun Zheng, Xiaoqi Math Biosci Article In this article, we present a new distance metric, the Weighted Sequence Entropy (WSE), based on the short word composition of biological sequences. As a revision of the classical relative entropy (RE), our metric (1) works equivalently with RE in the case of small k, (2) avoids the degeneracy when some word types are absent in one sequence but not in the other. Experiments on 25 viruses including SARS-CoVs show that our method and RE give exactly the same phylogenetic tree when word length [Formula: see text]. When [Formula: see text] , our method still works and gets convergent phylogenetic topology but the RE gives degenerate results. Elsevier Inc. 2008-09 2008-06-12 /pmc/articles/PMC7185439/ /pubmed/18590747 http://dx.doi.org/10.1016/j.mbs.2008.06.001 Text en Copyright © 2008 Elsevier Inc. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Wang, Jun Zheng, Xiaoqi WSE, a new sequence distance measure based on word frequencies |
title | WSE, a new sequence distance measure based on word frequencies |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7185439/ https://www.ncbi.nlm.nih.gov/pubmed/18590747 http://dx.doi.org/10.1016/j.mbs.2008.06.001 |
work_keys_str_mv | AT wangjun wseanewsequencedistancemeasurebasedonwordfrequencies AT zhengxiaoqi wseanewsequencedistancemeasurebasedonwordfrequencies |
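
The degeneracy of RE mentioned in the description can be illustrated with a short Python sketch. This record does not give the WSE formula, so the code below covers only classical relative entropy over k-word frequencies and is not the authors' method. The function names, the DNA alphabet, the use of overlapping windows, the base-2 logarithm, and the two toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def word_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequency of every length-k word over `alphabet`.

    Words are counted in overlapping windows. The sequence is assumed
    to contain only letters from `alphabet`.
    """
    n_windows = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    words = ("".join(w) for w in product(alphabet, repeat=k))
    return {w: counts[w] / n_windows for w in words}


def relative_entropy(p, q):
    """Classical relative entropy sum_w p(w) * log2(p(w) / q(w)).

    Returns math.inf when some word has p(w) > 0 but q(w) == 0,
    which is the degenerate case described in the abstract.
    """
    total = 0.0
    for word, pw in p.items():
        if pw == 0:
            continue  # terms with p(w) = 0 contribute nothing
        if q[word] == 0:
            return math.inf
        total += pw * math.log2(pw / q[word])
    return total


# Hypothetical toy sequences, not data from the paper.
seq_a = "ACGTACGTAC"
seq_b = "AACCGGTTAA"

for k in (1, 3):
    p = word_frequencies(seq_a, k)
    q = word_frequencies(seq_b, k)
    print(k, relative_entropy(p, q))
```

With k = 1 every letter occurs in both toy sequences, so the value is finite. With k = 3 the word ACG occurs in `seq_a` but not in `seq_b`, so the function returns infinity. Longer words make such absences more likely because the number of possible words grows as 4^k while the sequence length stays fixed.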