Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites
Position-weight matrices (PWMs) are broadly used to locate transcription factor binding sites in DNA sequences. The majority of existing PWMs provide a low level of both sensitivity and specificity. We present a new computational algorithm, a modification of the Staden–Bucher approach, that improves...
Main authors: | Gershenzon, Naum I., Stormo, Gary D., Ioshikhes, Ilya P. |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press 2005 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1084321/ https://www.ncbi.nlm.nih.gov/pubmed/15849315 http://dx.doi.org/10.1093/nar/gki519 |
_version_ | 1782123791617359872 |
---|---|
author | Gershenzon, Naum I. Stormo, Gary D. Ioshikhes, Ilya P. |
author_sort | Gershenzon, Naum I. |
collection | PubMed |
description | Position-weight matrices (PWMs) are broadly used to locate transcription factor binding sites in DNA sequences. The majority of existing PWMs provide a low level of both sensitivity and specificity. We present a new computational algorithm, a modification of the Staden–Bucher approach, that improves the PWM. We applied the proposed technique on the PWM of the GC-box, binding site for Sp1. The comparison of old and new PWMs shows that the latter increase both sensitivity and specificity. The statistical parameters of GC-box distribution in promoter regions and in the human genome, as well as in each chromosome, are presented. The majority of commonly used PWMs are the 4-row mononucleotide matrices, although 16-row dinucleotide matrices are known to be more informative. The algorithm efficiently determines the 16-row matrices and preliminary results show that such matrices provide better results than 4-row matrices. |
format | Text |
id | pubmed-1084321 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10843212005-04-22 Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites Gershenzon, Naum I. Stormo, Gary D. Ioshikhes, Ilya P. Nucleic Acids Res Article Position-weight matrices (PWMs) are broadly used to locate transcription factor binding sites in DNA sequences. The majority of existing PWMs provide a low level of both sensitivity and specificity. We present a new computational algorithm, a modification of the Staden–Bucher approach, that improves the PWM. We applied the proposed technique on the PWM of the GC-box, binding site for Sp1. The comparison of old and new PWMs shows that the latter increase both sensitivity and specificity. The statistical parameters of GC-box distribution in promoter regions and in the human genome, as well as in each chromosome, are presented. The majority of commonly used PWMs are the 4-row mononucleotide matrices, although 16-row dinucleotide matrices are known to be more informative. The algorithm efficiently determines the 16-row matrices and preliminary results show that such matrices provide better results than 4-row matrices. Oxford University Press 2005 2005-04-22 /pmc/articles/PMC1084321/ /pubmed/15849315 http://dx.doi.org/10.1093/nar/gki519 Text en © The Author 2005. Published by Oxford University Press. All rights reserved |
spellingShingle | Article Gershenzon, Naum I. Stormo, Gary D. Ioshikhes, Ilya P. Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites |
title | Computational technique for improvement of the position-weight matrices for the DNA/protein binding sites |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1084321/ https://www.ncbi.nlm.nih.gov/pubmed/15849315 http://dx.doi.org/10.1093/nar/gki519 |