PreCisIon: PREdiction of CIS-regulatory elements improved by gene’s positION
Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually repr...
Main authors: | Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2013 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561985/ https://www.ncbi.nlm.nih.gov/pubmed/23241390 http://dx.doi.org/10.1093/nar/gks1286 |
_version_ | 1782258027778277376 |
---|---|
author | Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François |
author_facet | Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François |
author_sort | Elati, Mohamed |
collection | PubMed |
description | Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases. |
format | Online Article Text |
id | pubmed-3561985 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3561985 2013-02-01 PreCisIon: PREdiction of CIS-regulatory elements improved by gene’s positION Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François Nucleic Acids Res Computational Biology Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases. Oxford University Press 2013-02 2012-12-14 /pmc/articles/PMC3561985/ /pubmed/23241390 http://dx.doi.org/10.1093/nar/gks1286 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com. |
spellingShingle | Computational Biology Elati, Mohamed Nicolle, Rémy Junier, Ivan Fernández, David Fekih, Rim Font, Julio Képès, François PreCisIon: PREdiction of CIS-regulatory elements improved by gene’s positION |
title | PreCisIon: PREdiction of CIS-regulatory elements improved by gene’s positION |
title_sort | precision: prediction of cis-regulatory elements improved by gene’s position |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561985/ https://www.ncbi.nlm.nih.gov/pubmed/23241390 http://dx.doi.org/10.1093/nar/gks1286 |
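
The description says PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. The record does not give the algorithm's details, so the following is only a minimal sketch of that general idea, assuming an AdaBoost-style scheme with one-feature threshold classifiers (decision stumps). The two features (`seq_score`, a motif match score, and `pos_score`, a gene-position score), the toy values, and all function names are hypothetical and are not taken from the paper.

```python
import math

def train_stump(X, y, w):
    """Find the one-feature threshold rule with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                preds = [pol if x[f] >= thr else -pol for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, x):
    _, f, thr, pol = stump
    return pol if x[f] >= thr else -pol

def adaboost(X, y, rounds=3):
    """Adaptively combine stumps; misclassified genes gain weight each round."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps: +1 = predicted target, -1 = non-target."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: (seq_score, pos_score) per gene.
# A gene is a target (+1) if either score is high, so no single stump suffices.
X = [(0.9, 0.2), (0.8, 0.9), (0.3, 0.8), (0.2, 0.1), (0.4, 0.3), (0.1, 0.4)]
y = [1, 1, 1, -1, -1, -1]

ensemble = adaboost(X, y, rounds=3)
train_predictions = [predict(ensemble, x) for x in X]
```

On this toy set no single threshold on either feature separates targets from non-targets, but the weighted vote of three stumps (two on the sequence score, one on the position score) does, which is the sense in which the two kinds of evidence complement each other.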