
Fast sequence analysis based on diamond sampling

Both in DNA and protein contexts, position weight matrices (PWMs) are an important method for modelling motifs in biological sequences. With the development of genome sequencing technology, the quantity of sequence data is increasing explosively, so faster searching algorithms which h...


Bibliographic Details
Main authors: Gao, Liangxin; Bao, Wenzhen; Zhang, Hongbo; Yuan, Chang-An; Huang, De-Shuang
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6023231/
https://www.ncbi.nlm.nih.gov/pubmed/29953448
http://dx.doi.org/10.1371/journal.pone.0198922
_version_ 1783335824645947392
author Gao, Liangxin
Bao, Wenzhen
Zhang, Hongbo
Yuan, Chang-An
Huang, De-Shuang
author_sort Gao, Liangxin
collection PubMed
description Both in DNA and protein contexts, position weight matrices (PWMs) are an important method for modelling motifs in biological sequences. With the development of genome sequencing technology, the quantity of sequence data is increasing explosively, so faster searching algorithms that can meet the growing demand are needed. In this paper, we propose a method for speeding up the search for candidate transcription factor binding sites (TFBS); users can specify a p threshold to get the desired trade-off between speed and sensitivity for a particular sequence analysis. Moreover, the proposed method can also be generalized to large-scale annotation and sequence projects.
format Online
Article
Text
id pubmed-6023231
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6023231 2018-07-07 Fast sequence analysis based on diamond sampling Gao, Liangxin Bao, Wenzhen Zhang, Hongbo Yuan, Chang-An Huang, De-Shuang PLoS One Research Article Both in DNA and protein contexts, position weight matrices (PWMs) are an important method for modelling motifs in biological sequences. With the development of genome sequencing technology, the quantity of sequence data is increasing explosively, so faster searching algorithms that can meet the growing demand are needed. In this paper, we propose a method for speeding up the search for candidate transcription factor binding sites (TFBS); users can specify a p threshold to get the desired trade-off between speed and sensitivity for a particular sequence analysis. Moreover, the proposed method can also be generalized to large-scale annotation and sequence projects. Public Library of Science 2018-06-28 /pmc/articles/PMC6023231/ /pubmed/29953448 http://dx.doi.org/10.1371/journal.pone.0198922 Text en © 2018 Gao et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Fast sequence analysis based on diamond sampling
topic Research Article