
Comprehensive genome-wide identification of angiosperm upstream ORFs with peptide sequences conserved in various taxonomic ranges using a novel pipeline, ESUCA

BACKGROUND: Upstream open reading frames (uORFs) in the 5′-untranslated regions (5′-UTRs) of certain eukaryotic mRNAs encode evolutionarily conserved functional peptides, such as cis-acting regulatory peptides that control translation of downstream main ORFs (mORFs). For genome-wide searches for uOR...


Bibliographic Details
Main authors: Takahashi, Hiro, Hayashi, Noriya, Hiragori, Yuta, Sasaki, Shun, Motomura, Taichiro, Yamashita, Yui, Naito, Satoshi, Takahashi, Anna, Fuse, Kazuyuki, Satou, Kenji, Endo, Toshinori, Kojima, Shoko, Onouchi, Hitoshi
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7106846/
https://www.ncbi.nlm.nih.gov/pubmed/32228449
http://dx.doi.org/10.1186/s12864-020-6662-5
author Takahashi, Hiro
Hayashi, Noriya
Hiragori, Yuta
Sasaki, Shun
Motomura, Taichiro
Yamashita, Yui
Naito, Satoshi
Takahashi, Anna
Fuse, Kazuyuki
Satou, Kenji
Endo, Toshinori
Kojima, Shoko
Onouchi, Hitoshi
author_sort Takahashi, Hiro
collection PubMed
description BACKGROUND: Upstream open reading frames (uORFs) in the 5′-untranslated regions (5′-UTRs) of certain eukaryotic mRNAs encode evolutionarily conserved functional peptides, such as cis-acting regulatory peptides that control translation of downstream main ORFs (mORFs). For genome-wide searches for uORFs with conserved peptide sequences (CPuORFs), comparative genomic studies have been conducted, in which uORF sequences were compared between selected species. To increase chances of identifying CPuORFs, we previously developed an approach in which uORF sequences were compared using BLAST between Arabidopsis and any other plant species with available transcript sequence databases. If this approach is applied to multiple plant species belonging to phylogenetically distant clades, it is expected to further comprehensively identify CPuORFs conserved in various plant lineages, including those conserved among relatively small taxonomic groups. RESULTS: To efficiently compare uORF sequences among many species and efficiently identify CPuORFs conserved in various taxonomic lineages, we developed a novel pipeline, ESUCA. We applied ESUCA to the genomes of five angiosperm species, which belong to phylogenetically distant clades, and selected CPuORFs conserved among at least three different orders. Through these analyses, we identified 89 novel CPuORF families. As expected, ESUCA analysis of each of the five angiosperm genomes identified many CPuORFs that were not identified from ESUCA analyses of the other four species. However, unexpectedly, these CPuORFs include those conserved across wide taxonomic ranges, indicating that the approach used here is useful not only for comprehensive identification of narrowly conserved CPuORFs but also for that of widely conserved CPuORFs. 
Examination of the effects of 11 selected CPuORFs on mORF translation revealed that CPuORFs conserved only in relatively narrow taxonomic ranges can have sequence-dependent regulatory effects, suggesting that most of the identified CPuORFs are conserved because of functional constraints of their encoded peptides. CONCLUSIONS: This study demonstrates that ESUCA is capable of efficiently identifying CPuORFs likely to be conserved because of the functional importance of their encoded peptides. Furthermore, our data show that the approach in which uORF sequences from multiple species are compared with those of many other species, using ESUCA, is highly effective in comprehensively identifying CPuORFs conserved in various taxonomic ranges.
format Online
Article
Text
id pubmed-7106846
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7106846 2020-04-01 Comprehensive genome-wide identification of angiosperm upstream ORFs with peptide sequences conserved in various taxonomic ranges using a novel pipeline, ESUCA Takahashi, Hiro Hayashi, Noriya Hiragori, Yuta Sasaki, Shun Motomura, Taichiro Yamashita, Yui Naito, Satoshi Takahashi, Anna Fuse, Kazuyuki Satou, Kenji Endo, Toshinori Kojima, Shoko Onouchi, Hitoshi BMC Genomics Research Article BioMed Central 2020-03-30 /pmc/articles/PMC7106846/ /pubmed/32228449 http://dx.doi.org/10.1186/s12864-020-6662-5 Text en © The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Comprehensive genome-wide identification of angiosperm upstream ORFs with peptide sequences conserved in various taxonomic ranges using a novel pipeline, ESUCA
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7106846/
https://www.ncbi.nlm.nih.gov/pubmed/32228449
http://dx.doi.org/10.1186/s12864-020-6662-5