
Fast sequence analysis based on diamond sampling


Bibliographic Details
Main Authors: Gao, Liangxin, Bao, Wenzhen, Zhang, Hongbo, Yuan, Chang-An, Huang, De-Shuang
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6023231/
https://www.ncbi.nlm.nih.gov/pubmed/29953448
http://dx.doi.org/10.1371/journal.pone.0198922
Description
Summary: In both DNA and protein contexts, position weight matrices (PWMs) are an important method for modelling motifs in biological sequences. With the development of genome sequencing technology, the quantity of sequence data is increasing explosively, so faster search algorithms that can meet this growing need are required. In this paper, we propose a method for speeding up the search for candidate transcription factor binding sites (TFBS), and users can specify a p threshold to get the desired trade-off between speed and sensitivity for a particular sequence analysis. Moreover, the proposed method can also be generalized to large-scale annotation and sequence projects.
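
For readers unfamiliar with PWM scanning, the following minimal sketch shows the exhaustive baseline that faster search methods aim to improve on: every window of the sequence is scored against the matrix and kept if it meets a cutoff. It is not the diamond sampling method of the paper. The matrix values, the function name `scan`, and the use of a raw score cutoff in place of the p threshold are all assumptions made for illustration.

```python
# Hypothetical log-odds PWM for a length-3 DNA motif ("ACG").
# The values are illustrative only and are not taken from the paper.
PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at least `threshold`.

    Assumption: `threshold` is a raw score cutoff; a real tool would derive
    it from the user-specified p threshold.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the per-position weights of the bases in this window.
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    for start, score in scan("ACGTACG", PWM, threshold=3.0):
        print(start, score)
```

Lowering the cutoff admits more candidate sites at the cost of more false positives, which is the kind of speed and sensitivity trade-off the summary refers to.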