OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes
Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arb...
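Main authors: | Brunet, Marie A; Brunelle, Mylène; Lucier, Jean-François; Delcourt, Vivian; Levesque, Maxime; Grenier, Frédéric; Samandi, Sondos; Leblanc, Sébastien; Aguilar, Jean-David; Dufour, Pascal; Jacques, Jean-Francois; Fournier, Isabelle; Ouangraoua, Aida; Scott, Michelle S; Boisvert, François-Michel; Roucou, Xavier |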
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Database Issue |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323990/ https://www.ncbi.nlm.nih.gov/pubmed/30299502 http://dx.doi.org/10.1093/nar/gky936 |
_version_ | 1783385886241587200 |
---|---|
author | Brunet, Marie A Brunelle, Mylène Lucier, Jean-François Delcourt, Vivian Levesque, Maxime Grenier, Frédéric Samandi, Sondos Leblanc, Sébastien Aguilar, Jean-David Dufour, Pascal Jacques, Jean-Francois Fournier, Isabelle Ouangraoua, Aida Scott, Michelle S Boisvert, François-Michel Roucou, Xavier |
author_facet | Brunet, Marie A Brunelle, Mylène Lucier, Jean-François Delcourt, Vivian Levesque, Maxime Grenier, Frédéric Samandi, Sondos Leblanc, Sébastien Aguilar, Jean-David Dufour, Pascal Jacques, Jean-Francois Fournier, Isabelle Ouangraoua, Aida Scott, Michelle S Boisvert, François-Michel Roucou, Xavier |
author_sort | Brunet, Marie A |
collection | PubMed |
description | Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arbitrary criterion of one coding sequence (CDS) per transcript, leading to a substantial underestimation of the coding potential of eukaryotes. Here, we present OpenProt, the first database fully endorsing a polycistronic model of eukaryotic genomes to date. OpenProt contains all possible ORFs longer than 30 codons across 10 species, and cumulates supporting evidence such as protein conservation, translation and expression. OpenProt annotates all known proteins (RefProts), novel predicted isoforms (Isoforms) and novel predicted proteins from alternative ORFs (AltProts). It incorporates cutting-edge algorithms to evaluate protein orthology and re-interrogate publicly available ribosome profiling and mass spectrometry datasets, supporting the annotation of thousands of predicted ORFs. The constantly growing database currently cumulates evidence from 87 ribosome profiling and 114 mass spectrometry studies from several species, tissues and cell lines. All data is freely available and downloadable from a web platform (www.openprot.org) supporting a genome browser and advanced queries for each species. Thus, OpenProt enables a more comprehensive landscape of eukaryotic genomes’ coding potential. |
format | Online Article Text |
id | pubmed-6323990 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6323990 2019-01-10 OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes Brunet, Marie A Brunelle, Mylène Lucier, Jean-François Delcourt, Vivian Levesque, Maxime Grenier, Frédéric Samandi, Sondos Leblanc, Sébastien Aguilar, Jean-David Dufour, Pascal Jacques, Jean-Francois Fournier, Isabelle Ouangraoua, Aida Scott, Michelle S Boisvert, François-Michel Roucou, Xavier Nucleic Acids Res Database Issue Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arbitrary criterion of one coding sequence (CDS) per transcript, leading to a substantial underestimation of the coding potential of eukaryotes. Here, we present OpenProt, the first database fully endorsing a polycistronic model of eukaryotic genomes to date. OpenProt contains all possible ORFs longer than 30 codons across 10 species, and cumulates supporting evidence such as protein conservation, translation and expression. OpenProt annotates all known proteins (RefProts), novel predicted isoforms (Isoforms) and novel predicted proteins from alternative ORFs (AltProts). It incorporates cutting-edge algorithms to evaluate protein orthology and re-interrogate publicly available ribosome profiling and mass spectrometry datasets, supporting the annotation of thousands of predicted ORFs. The constantly growing database currently cumulates evidence from 87 ribosome profiling and 114 mass spectrometry studies from several species, tissues and cell lines. All data is freely available and downloadable from a web platform (www.openprot.org) supporting a genome browser and advanced queries for each species. Thus, OpenProt enables a more comprehensive landscape of eukaryotic genomes’ coding potential.
Oxford University Press 2019-01-08 2018-10-09 /pmc/articles/PMC6323990/ /pubmed/30299502 http://dx.doi.org/10.1093/nar/gky936 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Database Issue Brunet, Marie A Brunelle, Mylène Lucier, Jean-François Delcourt, Vivian Levesque, Maxime Grenier, Frédéric Samandi, Sondos Leblanc, Sébastien Aguilar, Jean-David Dufour, Pascal Jacques, Jean-Francois Fournier, Isabelle Ouangraoua, Aida Scott, Michelle S Boisvert, François-Michel Roucou, Xavier OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes |
title | OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes |
title_sort | openprot: a more comprehensive guide to explore eukaryotic coding potential and proteomes |
topic | Database Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323990/ https://www.ncbi.nlm.nih.gov/pubmed/30299502 http://dx.doi.org/10.1093/nar/gky936 |