
Study of LZ-word distribution and its application for sequence comparison

Bibliographic Details
Main Authors: Dai, Qi; Yan, Zhaofang; Shi, Zhuoxing; Liu, Xiaoqing; Yao, Yuhua; He, Pingan
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2013
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7094135/
https://www.ncbi.nlm.nih.gov/pubmed/23876763
http://dx.doi.org/10.1016/j.jtbi.2013.07.008
Description
Summary: Lempel–Ziv complexity has been widely used for sequence comparison and has achieved promising results, but until now the distribution of components in the exhaustive history has not been studied. This paper investigates the whole distribution of LZ-words and presents a novel statistical method for sequence comparison. With the components' lengths in mind, we revised Lempel–Ziv complexity and obtained various sets of LZ-words. Instead of calculating the LZ-words' contents, we defined a series of set operations on the LZ-word sets to compare biological sequences. To assess the effectiveness of the proposed method, we performed two sets of experiments and compared it with alignment-based methods.
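
To make the terms in the summary concrete, the following is a minimal sketch of the standard (unrevised) Lempel–Ziv exhaustive history, whose components are the LZ-words, followed by one simple set-based comparison. It is not the authors' method: the summary does not specify the revised complexity or the set operations used, so the Jaccard distance here is only an illustrative stand-in, and the function names lz_words and lz_set_distance are hypothetical.

```python
def lz_words(seq):
    """Return the components (LZ-words) of the exhaustive history of seq.

    Each word is the shortest substring starting at the current position
    that cannot be copied from the text preceding its last character.
    The final word may be reproducible if the sequence ends first.
    """
    words = []
    i, n = 0, len(seq)
    while i < n:
        j = i + 1
        # Extend the candidate while it still occurs in seq[:j-1].
        while j <= n and seq[i:j] in seq[:j - 1]:
            j += 1
        words.append(seq[i:min(j, n)])
        i = j
    return words


def lz_set_distance(a, b):
    """Jaccard distance between the LZ-word sets of two sequences.

    Assumption: this is an illustrative set operation, not the one
    defined in the paper.
    """
    set_a, set_b = set(lz_words(a)), set(lz_words(b))
    union = set_a | set_b
    if not union:
        return 0.0  # two empty sequences are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


if __name__ == "__main__":
    print(lz_words("AACGTACCATTG"))
    print(lz_set_distance("AACGTACCATTG", "AACGTACCATTC"))
```

The number of words returned by lz_words is the classical Lempel–Ziv complexity of the sequence; comparing the sets of words rather than only their counts is the general idea the summary describes.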