
PSSM-based prediction of DNA binding sites in proteins

BACKGROUND: Detection of DNA-binding sites in proteins is of enormous interest for technologies targeting gene regulation and manipulation. We have previously shown that a residue and its sequence neighbor information can be used to predict DNA-binding candidates in a protein sequence. This sequence...


Bibliographic Details
Main authors:	Ahmad, Shandar; Sarai, Akinori
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC550660/ https://www.ncbi.nlm.nih.gov/pubmed/15720719 http://dx.doi.org/10.1186/1471-2105-6-33

author	Ahmad, Shandar; Sarai, Akinori
collection	PubMed
description	BACKGROUND: Detection of DNA-binding sites in proteins is of enormous interest for technologies targeting gene regulation and manipulation. We have previously shown that a residue and its sequence neighbor information can be used to predict DNA-binding candidates in a protein sequence. This sequence-based prediction method is applicable even if no sequence homology with a previously known DNA-binding protein is observed. Here we implement a neural-network-based algorithm that uses the evolutionary information of amino acid sequences, in the form of their position-specific scoring matrices (PSSMs), for better prediction of DNA-binding sites. RESULTS: The average of sensitivity and specificity using PSSMs is up to 8.7% better than that of prediction with sequence information only. Much smaller data sets could be used to generate PSSMs with minimal loss of prediction accuracy. CONCLUSION: One problem with PSSM-derived prediction is that it requires lengthy and time-consuming alignments against large sequence databases. To speed up the generation of PSSMs, we tried different reference data sets (sequence space) against which a target protein is scanned in PSI-BLAST iterations. We find that a very small set of proteins can be used as such reference data without losing much of the prediction value. This makes generating PSSMs very rapid and even amenable to use at the genome level. A web server has been developed to provide these predictions of DNA-binding sites for any new protein from its amino acid sequence. AVAILABILITY: Online predictions based on this method are available at
format	Text
id	pubmed-550660
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-550660 2005-02-27 PSSM-based prediction of DNA binding sites in proteins Ahmad, Shandar; Sarai, Akinori BMC Bioinformatics Research Article BioMed Central 2005-02-19 /pmc/articles/PMC550660/ /pubmed/15720719 http://dx.doi.org/10.1186/1471-2105-6-33 Text en Copyright © 2005 Ahmad and Sarai; licensee BioMed Central Ltd.
title	PSSM-based prediction of DNA binding sites in proteins
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC550660/ https://www.ncbi.nlm.nih.gov/pubmed/15720719 http://dx.doi.org/10.1186/1471-2105-6-33
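
The description above refers to predicting a residue's DNA-binding status from its PSSM row together with those of its sequence neighbors, and to scoring predictions by the average of sensitivity and specificity. The following is a minimal Python sketch of those two ideas only. It is not the authors' implementation: the function names, the window half-width, and the zero-padding at the sequence ends are all assumptions made for illustration, and the neural network itself is omitted.

```python
from typing import List, Sequence


def window_features(pssm: Sequence[Sequence[float]],
                    half_width: int = 2) -> List[List[float]]:
    """Build one feature vector per residue from a PSSM.

    Each vector concatenates the PSSM rows of the residue and of its
    `half_width` neighbours on each side. Positions that fall outside
    the sequence are filled with zeros (an assumption of this sketch).
    """
    n_cols = len(pssm[0]) if pssm else 0
    pad = [0.0] * n_cols
    features: List[List[float]] = []
    for i in range(len(pssm)):
        vec: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            row = pssm[j] if 0 <= j < len(pssm) else pad
            vec.extend(float(x) for x in row)
        features.append(vec)
    return features


def mean_sensitivity_specificity(y_true: Sequence[int],
                                 y_pred: Sequence[int]) -> float:
    """Average of sensitivity and specificity for binary labels (1 = binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against empty classes so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2
```

In practice each PSSM row would have 20 columns (one per amino acid), so a half-width of 2 would give 100 input values per residue; the vectors returned by `window_features` would then be fed to whatever classifier is being trained.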


Similar Items