
Local combinational variables: an approach used in DNA-binding helix-turn-helix motif prediction with sequence information

Sequence-based approach for motif prediction is of great interest and remains a challenge. In this work, we develop a local combinational variable approach for sequence-based helix-turn-helix (HTH) motif prediction. First we choose a sequence data set for 88 proteins of 22 amino acids in length to l...


Bibliographic Details
Main authors: Xiong, Wenwei; Li, Tonghua; Chen, Kai; Tang, Kailin
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761287/
https://www.ncbi.nlm.nih.gov/pubmed/19651875
http://dx.doi.org/10.1093/nar/gkp628
author Xiong, Wenwei
Li, Tonghua
Chen, Kai
Tang, Kailin
collection PubMed
description Sequence-based motif prediction is of great interest and remains a challenge. In this work, we develop a local combinational variable approach for sequence-based helix-turn-helix (HTH) motif prediction. First, we choose a data set of 88 protein sequences, each 22 amino acids in length, and launch an optimized traversal to extract local combinational segments (LCS) from it. Then, after LCS refinement, local combinational variables (LCV) are generated to construct prediction models for HTH motifs. The prediction ability of LCV sets at different thresholds is calculated to settle on a moderate threshold. The large data set we used comprises 13 HTH families, with 17 455 sequences in total. Our approach predicts HTH motifs more precisely using only primary protein sequence information, with 93.29% accuracy, 93.93% sensitivity and 92.66% specificity. Prediction results for newly reported HTH-containing proteins, compared with those of other prediction web services, indicate that the LCV approach yields a good prediction model. Comparisons with profile-HMM models from the Pfam protein families database show that the LCV approach maintains a good balance when dealing with HTH-containing and non-HTH proteins at the same time. The LCV approach is to some extent complementary to the profile-HMM models because it better identifies false-positive data. Furthermore, genome-wide predictions detect new HTH proteins in both Homo sapiens and Escherichia coli, which broadens the applications of the LCV approach. Software for mining LCVs from a sequence data set can be obtained freely from the anonymous FTP site ftp://cheminfo.tongji.edu.cn/LCV/.
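The accuracy, sensitivity and specificity quoted in the description are standard binary-classification measures. As a minimal sketch of how such figures are usually derived from a confusion matrix, the Python below uses hypothetical names (`ConfusionCounts`, `sensitivity`, `specificity`, `accuracy`) and made-up counts chosen only for illustration. Neither the names nor the counts come from the article, and this is not the authors' LCV software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary HTH / non-HTH classifier (hypothetical container)."""
    tp: int  # HTH sequences predicted as HTH
    fn: int  # HTH sequences predicted as non-HTH
    tn: int  # non-HTH sequences predicted as non-HTH
    fp: int  # non-HTH sequences predicted as HTH


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true HTH sequences that are recovered."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-HTH sequences that are correctly rejected."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all sequences that are classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Illustrative counts only; they are NOT taken from the article.
example = ConfusionCounts(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sensitivity(example):.2f}")
print(f"specificity={specificity(example):.2f}")
print(f"accuracy={accuracy(example):.2f}")
```

When the positive and negative sets are of similar size, accuracy falls between sensitivity and specificity, which is consistent with the three percentages reported in the description.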
format Text
id pubmed-2761287
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2761287 2009-10-14 Local combinational variables: an approach used in DNA-binding helix-turn-helix motif prediction with sequence information Xiong, Wenwei Li, Tonghua Chen, Kai Tang, Kailin Nucleic Acids Res Computational Biology
Oxford University Press 2009-09 2009-08-03 /pmc/articles/PMC2761287/ /pubmed/19651875 http://dx.doi.org/10.1093/nar/gkp628 Text en © 2009 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Local combinational variables: an approach used in DNA-binding helix-turn-helix motif prediction with sequence information
topic Computational Biology