A unified model for yeast transcript definition

Identifying genes in the genomic context is central to a cell's ability to interpret the genome. Yet, in general, the signals used to define eukaryotic genes are poorly described. Here, we derived simple classifiers that identify where transcription will initiate and terminate using nucleic aci...

Bibliographic Details
Main authors: de Boer, Carl G., van Bakel, Harm, Tsui, Kyle, Li, Joyce, Morris, Quaid D., Nislow, Corey, Greenblatt, Jack F., Hughes, Timothy R.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3875857/
https://www.ncbi.nlm.nih.gov/pubmed/24170600
http://dx.doi.org/10.1101/gr.164327.113
_version_ 1782297416772354048
author de Boer, Carl G.
van Bakel, Harm
Tsui, Kyle
Li, Joyce
Morris, Quaid D.
Nislow, Corey
Greenblatt, Jack F.
Hughes, Timothy R.
author_sort de Boer, Carl G.
collection PubMed
description Identifying genes in the genomic context is central to a cell's ability to interpret the genome. Yet, in general, the signals used to define eukaryotic genes are poorly described. Here, we derived simple classifiers that identify where transcription will initiate and terminate using nucleic acid sequence features detectable by the yeast cell, which we integrate into a Unified Model (UM) that models transcription as a whole. The cis-elements that denote where transcription initiates function primarily through nucleosome depletion, and, using a synthetic promoter system, we show that most of these elements are sufficient to initiate transcription in vivo. Hrp1 binding sites are the major characteristic of terminators; these binding sites are often clustered in terminator regions and can terminate transcription bidirectionally. The UM predicts global transcript structure by modeling transcription of the genome using a hidden Markov model whose emissions are the outputs of the initiation and termination classifiers. We validated the novel predictions of the UM with available RNA-seq data and tested it further by directly comparing the transcript structure predicted by the model to the transcription generated by the cell for synthetic DNA segments of random design. We show that the UM identifies transcription start sites more accurately than the initiation classifier alone, indicating that the relative arrangement of promoter and terminator elements influences their function. Our model presents a concrete description of how the cell defines transcript units, explains the existence of nongenic transcripts, and provides insight into genome evolution.
format Online
Article
Text
id pubmed-3875857
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3875857 2014-01-07 A unified model for yeast transcript definition de Boer, Carl G. van Bakel, Harm Tsui, Kyle Li, Joyce Morris, Quaid D. Nislow, Corey Greenblatt, Jack F. Hughes, Timothy R. Genome Res Method
Cold Spring Harbor Laboratory Press 2014-01 /pmc/articles/PMC3875857/ /pubmed/24170600 http://dx.doi.org/10.1101/gr.164327.113 Text en © 2014 de Boer et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/3.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution 3.0 Unported), as described at http://creativecommons.org/licenses/by/3.0.
title A unified model for yeast transcript definition
title_sort unified model for yeast transcript definition
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3875857/
https://www.ncbi.nlm.nih.gov/pubmed/24170600
http://dx.doi.org/10.1101/gr.164327.113
work_keys_str_mv AT deboercarlg aunifiedmodelforyeasttranscriptdefinition
AT vanbakelharm aunifiedmodelforyeasttranscriptdefinition
AT tsuikyle aunifiedmodelforyeasttranscriptdefinition
AT lijoyce aunifiedmodelforyeasttranscriptdefinition
AT morrisquaidd aunifiedmodelforyeasttranscriptdefinition
AT nislowcorey aunifiedmodelforyeasttranscriptdefinition
AT greenblattjackf aunifiedmodelforyeasttranscriptdefinition
AT hughestimothyr aunifiedmodelforyeasttranscriptdefinition
AT deboercarlg unifiedmodelforyeasttranscriptdefinition
AT vanbakelharm unifiedmodelforyeasttranscriptdefinition
AT tsuikyle unifiedmodelforyeasttranscriptdefinition
AT lijoyce unifiedmodelforyeasttranscriptdefinition
AT morrisquaidd unifiedmodelforyeasttranscriptdefinition
AT nislowcorey unifiedmodelforyeasttranscriptdefinition
AT greenblattjackf unifiedmodelforyeasttranscriptdefinition
AT hughestimothyr unifiedmodelforyeasttranscriptdefinition
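
The abstract in this record describes a hidden Markov model whose emissions are the outputs of initiation and termination classifiers. The following is only a minimal illustrative sketch of that general idea, not the authors' Unified Model: the four state names, the transition probabilities, the emission formula, and the toy scores are all assumptions made for this example, and a single strand is modeled. It decodes the most likely state path with the standard Viterbi algorithm in log space.

```python
import math

# Hypothetical states for one strand; the paper's actual state set is not given here.
STATES = ["intergenic", "initiation", "transcribed", "termination"]

# Assumed transition probabilities (each row sums to 1); not taken from the paper.
TRANS = {
    "intergenic":  {"intergenic": 0.9, "initiation": 0.1},
    "initiation":  {"initiation": 0.5, "transcribed": 0.5},
    "transcribed": {"transcribed": 0.9, "termination": 0.1},
    "termination": {"termination": 0.5, "intergenic": 0.5},
}

EPS = 1e-6  # keeps scores away from exactly 0 or 1 so log() is always defined


def emission(state, init_score, term_score):
    """Assumed likelihood of a pair of classifier scores given a state."""
    i = min(max(init_score, EPS), 1 - EPS)
    t = min(max(term_score, EPS), 1 - EPS)
    if state == "initiation":
        return i * (1 - t)        # high initiation score, low termination score
    if state == "termination":
        return t * (1 - i)        # high termination score, low initiation score
    return (1 - i) * (1 - t)      # both scores low elsewhere


def viterbi(init_scores, term_scores):
    """Return the most likely state path; the path is forced to start intergenic."""
    n = len(init_scores)
    if n == 0:
        return []
    neg_inf = float("-inf")
    score = {s: neg_inf for s in STATES}
    score["intergenic"] = math.log(
        emission("intergenic", init_scores[0], term_scores[0])
    )
    back = []  # back[k][s] = best predecessor of state s at position k + 1
    for pos in range(1, n):
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev, best_val = None, neg_inf
            for p in STATES:
                prob = TRANS[p].get(s)
                if prob is None or score[p] == neg_inf:
                    continue
                val = score[p] + math.log(prob)
                if val > best_val:
                    best_prev, best_val = p, val
            if best_prev is None:
                new_score[s] = neg_inf  # state unreachable at this position
            else:
                new_score[s] = best_val + math.log(
                    emission(s, init_scores[pos], term_scores[pos])
                )
            pointers[s] = best_prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy classifier outputs (invented): one promoter-like and one terminator-like peak.
init_scores = [0.01, 0.01, 0.99, 0.01, 0.01, 0.01, 0.01, 0.01]
term_scores = [0.01, 0.01, 0.01, 0.01, 0.01, 0.99, 0.01, 0.01]
path = viterbi(init_scores, term_scores)
print(path)
```

Because the decoded path must pass through an initiation state before a termination state can be used, the label at each position depends on the arrangement of both kinds of signal rather than on either score alone, which is the kind of context dependence the abstract attributes to the combined model.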