Fast sequence analysis based on diamond sampling
In both DNA and protein contexts, an important method for modelling motifs in biological sequences is the position weight matrix (PWM). With the development of genome sequencing technology, the quantity of sequence data is increasing explosively, so faster searching algorithms which h...
Main authors: | Gao, Liangxin; Bao, Wenzhen; Zhang, Hongbo; Yuan, Chang-An; Huang, De-Shuang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6023231/ https://www.ncbi.nlm.nih.gov/pubmed/29953448 http://dx.doi.org/10.1371/journal.pone.0198922 |
_version_ | 1783335824645947392 |
---|---|
author | Gao, Liangxin Bao, Wenzhen Zhang, Hongbo Yuan, Chang-An Huang, De-Shuang |
author_sort | Gao, Liangxin |
collection | PubMed |
description | In both DNA and protein contexts, an important method for modelling motifs in biological sequences is the position weight matrix (PWM). With the development of genome sequencing technology, the quantity of sequence data is increasing explosively, so faster search algorithms that can keep pace with this growing demand are needed. In this paper, we propose a method for speeding up the search for candidate transcription factor binding sites (TFBS); users can specify a p threshold to obtain the desired trade-off between speed and sensitivity for a particular sequence analysis. Moreover, the proposed method can be generalized to large-scale annotation and sequence projects. |
format | Online Article Text |
id | pubmed-6023231 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-60232312018-07-07 Fast sequence analysis based on diamond sampling Gao, Liangxin Bao, Wenzhen Zhang, Hongbo Yuan, Chang-An Huang, De-Shuang PLoS One Research Article In both DNA and protein contexts, an important method for modelling motifs in biological sequences is the position weight matrix (PWM). With the development of genome sequencing technology, the quantity of sequence data is increasing explosively, so faster search algorithms that can keep pace with this growing demand are needed. In this paper, we propose a method for speeding up the search for candidate transcription factor binding sites (TFBS); users can specify a p threshold to obtain the desired trade-off between speed and sensitivity for a particular sequence analysis. Moreover, the proposed method can be generalized to large-scale annotation and sequence projects. Public Library of Science 2018-06-28 /pmc/articles/PMC6023231/ /pubmed/29953448 http://dx.doi.org/10.1371/journal.pone.0198922 Text en © 2018 Gao et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Gao, Liangxin Bao, Wenzhen Zhang, Hongbo Yuan, Chang-An Huang, De-Shuang Fast sequence analysis based on diamond sampling |
title | Fast sequence analysis based on diamond sampling |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6023231/ https://www.ncbi.nlm.nih.gov/pubmed/29953448 http://dx.doi.org/10.1371/journal.pone.0198922 |