
Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites

Position-weight matrices (PWMs) are broadly used to locate transcription factor binding sites in DNA sequences. The majority of existing PWMs provide a low level of both sensitivity and specificity. We present a new computational algorithm, a modification of the Staden–Bucher approach, that improves...
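As a rough illustration of what locating binding sites with a 4-row mononucleotide PWM involves, the following minimal Python sketch builds a log-odds matrix from aligned sites and slides it along a sequence. It is not the authors' modification of the Staden–Bucher approach, whose details this record does not give. The function names, the toy sites, the pseudocount, the uniform background frequency, and the threshold are all assumptions made for illustration only.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a 4-row log-odds PWM (one dict per position) from equal-length sites.

    The pseudocount and uniform background are illustrative assumptions.
    """
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds weights for one window."""
    return sum(col[base] for col, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        s = score(pwm, sequence[start:start + width])
        if s >= threshold:
            hits.append((start, s))
    return hits


# Hypothetical toy sites, invented for illustration (not data from the paper).
toy_sites = ["GGGCGG", "GGGCGG", "GGGCGT", "GGGAGG"]
pwm = build_pwm(toy_sites)
hits = scan(pwm, "ATGGGCGGTA", threshold=5.0)
```

Raising the threshold trades sensitivity for specificity, which is the balance the abstract refers to. A 16-row dinucleotide matrix would follow the same pattern, with each column indexed by an adjacent base pair (AA, AC, ..., TT) instead of a single base.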

Bibliographic Details
Main Authors: Gershenzon, Naum I., Stormo, Gary D., Ioshikhes, Ilya P.
Format: Text
Language: English
Published: Oxford University Press 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1084321/
https://www.ncbi.nlm.nih.gov/pubmed/15849315
http://dx.doi.org/10.1093/nar/gki519
_version_ 1782123791617359872
author Gershenzon, Naum I.
Stormo, Gary D.
Ioshikhes, Ilya P.
author_facet Gershenzon, Naum I.
Stormo, Gary D.
Ioshikhes, Ilya P.
author_sort Gershenzon, Naum I.
collection PubMed
description Position-weight matrices (PWMs) are broadly used to locate transcription factor binding sites in DNA sequences. The majority of existing PWMs provide a low level of both sensitivity and specificity. We present a new computational algorithm, a modification of the Staden–Bucher approach, that improves the PWM. We applied the proposed technique to the PWM of the GC-box, the binding site for Sp1. Comparison of the old and new PWMs shows that the latter increases both sensitivity and specificity. The statistical parameters of the GC-box distribution in promoter regions and in the human genome, as well as in each chromosome, are presented. The majority of commonly used PWMs are 4-row mononucleotide matrices, although 16-row dinucleotide matrices are known to be more informative. The algorithm efficiently determines the 16-row matrices, and preliminary results show that such matrices perform better than 4-row matrices.
format Text
id pubmed-1084321
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10843212005-04-22 Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites Gershenzon, Naum I. Stormo, Gary D. Ioshikhes, Ilya P. Nucleic Acids Res Article Position-weight matrices (PWMs) are broadly used to locate transcription factor binding sites in DNA sequences. The majority of existing PWMs provide a low level of both sensitivity and specificity. We present a new computational algorithm, a modification of the Staden–Bucher approach, that improves the PWM. We applied the proposed technique on the PWM of the GC-box, binding site for Sp1. The comparison of old and new PWMs shows that the latter increase both sensitivity and specificity. The statistical parameters of GC-box distribution in promoter regions and in the human genome, as well as in each chromosome, are presented. The majority of commonly used PWMs are the 4-row mononucleotide matrices, although 16-row dinucleotide matrices are known to be more informative. The algorithm efficiently determines the 16-row matrices and preliminary results show that such matrices provide better results than 4-row matrices. Oxford University Press 2005 2005-04-22 /pmc/articles/PMC1084321/ /pubmed/15849315 http://dx.doi.org/10.1093/nar/gki519 Text en © The Author 2005. Published by Oxford University Press. All rights reserved
spellingShingle Article
Gershenzon, Naum I.
Stormo, Gary D.
Ioshikhes, Ilya P.
Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites
title Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1084321/
https://www.ncbi.nlm.nih.gov/pubmed/15849315
http://dx.doi.org/10.1093/nar/gki519