Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae

Extensive transcriptional activity occurring in intergenic regions of genomes has raised the question whether intergenic transcription represents the activity of novel genes or noisy expression. To address this, we evaluated cross-species and post-duplication sequence and expression conservation of...

Bibliographic Details
Main authors: Lloyd, John P., Bowman, Megan J., Azodi, Christina B., Sowers, Rosalie P., Moghe, Gaurav D., Childs, Kevin L., Shiu, Shin-Han
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6702216/
https://www.ncbi.nlm.nih.gov/pubmed/31431676
http://dx.doi.org/10.1038/s41598-019-47797-y
_version_ 1783445179153252352
author Lloyd, John P.
Bowman, Megan J.
Azodi, Christina B.
Sowers, Rosalie P.
Moghe, Gaurav D.
Childs, Kevin L.
Shiu, Shin-Han
author_sort Lloyd, John P.
collection PubMed
description Extensive transcriptional activity occurring in intergenic regions of genomes has raised the question whether intergenic transcription represents the activity of novel genes or noisy expression. To address this, we evaluated cross-species and post-duplication sequence and expression conservation of intergenic transcribed regions (ITRs) in four Poaceae species. Among 43,301 ITRs across the four species, 34,460 (80%) are species-specific. ITRs found across species tend to be more divergent in expression and have more recent duplicates compared to annotated genes. To assess if ITRs are functional (under selection), machine learning models were established in Oryza sativa (rice) that could accurately distinguish between phenotype genes and pseudogenes (area under curve-receiver operating characteristic = 0.94). Based on the models, 584 (8%) and 4391 (61%) rice ITRs are classified as likely functional and nonfunctional with high confidence, respectively. ITRs with conserved expression and ancient retained duplicates, features that were not part of the model, are frequently classified as likely-functional, suggesting these characteristics could serve as pragmatic rules of thumb for identifying candidate sequences likely to be under selection. This study also provides a framework to identify novel genes using comparative transcriptomic data to improve genome annotation that is fundamental for connecting genotype to phenotype in crop and model systems.
format Online
Article
Text
id pubmed-6702216
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6702216 2019-08-23 Sci Rep Article
Nature Publishing Group UK 2019-08-20 /pmc/articles/PMC6702216/ /pubmed/31431676 http://dx.doi.org/10.1038/s41598-019-47797-y Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6702216/
https://www.ncbi.nlm.nih.gov/pubmed/31431676
http://dx.doi.org/10.1038/s41598-019-47797-y