
Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific fun...


Bibliographic Details
Main Authors: Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3708275/
https://www.ncbi.nlm.nih.gov/pubmed/23874343
http://dx.doi.org/10.3389/fpls.2013.00170
_version_ 1782276594410192896
author Turco, Gina
Schnable, James C.
Pedersen, Brent
Freeling, Michael
author_facet Turco, Gina
Schnable, James C.
Pedersen, Brent
Freeling, Michael
author_sort Turco, Gina
collection PubMed
description Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize.
format Online
Article
Text
id pubmed-3708275
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-37082752013-07-19 Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses Turco, Gina Schnable, James C. Pedersen, Brent Freeling, Michael Front Plant Sci Plant Science Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. Frontiers Media S.A. 2013-07-02 /pmc/articles/PMC3708275/ /pubmed/23874343 http://dx.doi.org/10.3389/fpls.2013.00170 Text en Copyright © 2013 Turco, Schnable, Pedersen and Freeling. 
http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
spellingShingle Plant Science
Turco, Gina
Schnable, James C.
Pedersen, Brent
Freeling, Michael
Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses
title Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses
title_full Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses
title_fullStr Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses
title_full_unstemmed Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses
title_short Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses
title_sort automated conserved non-coding sequence (cns) discovery reveals differences in gene content and promoter evolution among grasses
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3708275/
https://www.ncbi.nlm.nih.gov/pubmed/23874343
http://dx.doi.org/10.3389/fpls.2013.00170