Local combinational variables: an approach used in DNA-binding helix-turn-helix motif prediction with sequence information
Sequence-based approach for motif prediction is of great interest and remains a challenge. In this work, we develop a local combinational variable approach for sequence-based helix-turn-helix (HTH) motif prediction. First we choose a sequence data set for 88 proteins of 22 amino acids in length to l...
Main authors: | Xiong, Wenwei; Li, Tonghua; Chen, Kai; Tang, Kailin |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2009 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761287/ https://www.ncbi.nlm.nih.gov/pubmed/19651875 http://dx.doi.org/10.1093/nar/gkp628 |
_version_ | 1782172822379954176 |
---|---|
author | Xiong, Wenwei; Li, Tonghua; Chen, Kai; Tang, Kailin |
author_facet | Xiong, Wenwei; Li, Tonghua; Chen, Kai; Tang, Kailin |
author_sort | Xiong, Wenwei |
collection | PubMed |
description | Sequence-based approach for motif prediction is of great interest and remains a challenge. In this work, we develop a local combinational variable approach for sequence-based helix-turn-helix (HTH) motif prediction. First we choose a sequence data set for 88 proteins of 22 amino acids in length to launch an optimized traversal for extracting local combinational segments (LCS) from the data set. Then, after LCS refinement, local combinational variables (LCV) are generated to construct prediction models for HTH motifs. The prediction ability of LCV sets at different thresholds is calculated to settle on a moderate threshold. The large data set we used comprises 13 HTH families, with 17 455 sequences in total. Our approach predicts HTH motifs more precisely using only primary protein sequence information, with 93.29% accuracy, 93.93% sensitivity and 92.66% specificity. Prediction results for newly reported HTH-containing proteins, compared with those of other prediction web services, indicate that the LCV approach yields a good prediction model. Comparisons with profile-HMM models from the Pfam protein families database show that the LCV approach maintains a good balance while dealing with HTH-containing proteins and non-HTH proteins at the same time. The LCV approach is to some extent complementary to the profile-HMM models because of its better identification of false-positive data. Furthermore, genome-wide predictions detect new HTH proteins in both Homo sapiens and Escherichia coli, which broadens the applications of the LCV approach. Software for mining LCVs from a sequence data set can be obtained freely from the anonymous FTP site ftp://cheminfo.tongji.edu.cn/LCV/. |
format | Text |
id | pubmed-2761287 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2761287 2009-10-14 Local combinational variables: an approach used in DNA-binding helix-turn-helix motif prediction with sequence information Xiong, Wenwei Li, Tonghua Chen, Kai Tang, Kailin Nucleic Acids Res Computational Biology
Oxford University Press 2009-09 2009-08-03 /pmc/articles/PMC2761287/ /pubmed/19651875 http://dx.doi.org/10.1093/nar/gkp628 Text en © 2009 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Computational Biology Xiong, Wenwei Li, Tonghua Chen, Kai Tang, Kailin Local combinational variables: an approach used in DNA-binding helix-turn-helix motif prediction with sequence information |
title | Local combinational variables: an approach used in DNA-binding helix-turn-helix motif prediction with sequence information |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761287/ https://www.ncbi.nlm.nih.gov/pubmed/19651875 http://dx.doi.org/10.1093/nar/gkp628 |
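
The accuracy, sensitivity and specificity quoted in the description field are presumably the standard confusion-matrix measures. The sketch below is a minimal illustration under that assumption: the function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the paper or its software.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity and specificity as fractions.

    tp/fn: HTH sequences predicted as HTH / as non-HTH.
    tn/fp: non-HTH sequences predicted as non-HTH / as HTH.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # true-positive rate
        "specificity": tn / negatives,   # true-negative rate
    }


# Hypothetical counts for illustration only (NOT results from the paper).
metrics = classification_metrics(tp=90, fp=8, tn=92, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

When the positive and negative sets are the same size, as in these made-up counts, accuracy is simply the mean of sensitivity and specificity.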