Computational analysis of core promoters in the Drosophila genome

BACKGROUND: The core promoter, a region of about 100 base-pairs flanking the transcription start site (TSS), serves as the recognition site for the basal transcription apparatus. Drosophila TSSs have generally been mapped by individual experiments; the low number of accurately mapped TSSs has limite...

Bibliographic Details
Main authors: Ohler, Uwe; Liao, Guo-chun; Niemann, Heinrich; Rubin, Gerald M
Format: Text
Language: English
Published: BioMed Central 2002
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC151189/
https://www.ncbi.nlm.nih.gov/pubmed/12537576
http://dx.doi.org/10.1186/gb-2002-3-12-research0087
author Ohler, Uwe
Liao, Guo-chun
Niemann, Heinrich
Rubin, Gerald M
author_sort Ohler, Uwe
collection PubMed
description BACKGROUND: The core promoter, a region of about 100 base-pairs flanking the transcription start site (TSS), serves as the recognition site for the basal transcription apparatus. Drosophila TSSs have generally been mapped by individual experiments; the low number of accurately mapped TSSs has limited analysis of promoter sequence motifs and the training of computational prediction tools. RESULTS: We identified TSS candidates for about 2,000 Drosophila genes by aligning 5' expressed sequence tags (ESTs) from cap-trapped cDNA libraries to the genome, while applying stringent criteria concerning coverage and 5'-end distribution. Examination of the sequences flanking these TSSs revealed the presence of well-known core promoter motifs such as the TATA box, the initiator and the downstream promoter element (DPE). We also define, and assess the distribution of, several new motifs prevalent in core promoters, including what appears to be a variant DPE motif. Among the prevalent motifs is the DNA-replication-related element DRE, recently shown to be part of the recognition site for the TBP-related factor TRF2. Our TSS set was then used to retrain the computational promoter predictor McPromoter, allowing us to improve the recognition performance to over 50% sensitivity and 40% specificity. We compare these computational results to promoter prediction in vertebrates. CONCLUSIONS: There are relatively few recognizable binding sites for previously known general transcription factors in Drosophila core promoters. However, we identified several new motifs enriched in promoter regions. We were also able to significantly improve the performance of computational TSS prediction in Drosophila.
format Text
id pubmed-151189
institution National Center for Biotechnology Information
language English
publishDate 2002
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-151189 2003-03-13 Computational analysis of core promoters in the Drosophila genome Ohler, Uwe Liao, Guo-chun Niemann, Heinrich Rubin, Gerald M Genome Biol Research
BioMed Central 2002 2002-12-20 /pmc/articles/PMC151189/ /pubmed/12537576 http://dx.doi.org/10.1186/gb-2002-3-12-research0087 Text en Copyright © 2002 Ohler et al., licensee BioMed Central Ltd
title Computational analysis of core promoters in the Drosophila genome
title_sort computational analysis of core promoters in the drosophila genome
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC151189/
https://www.ncbi.nlm.nih.gov/pubmed/12537576
http://dx.doi.org/10.1186/gb-2002-3-12-research0087