
OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes

Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arb...


Bibliographic Details
Main Authors: Brunet, Marie A, Brunelle, Mylène, Lucier, Jean-François, Delcourt, Vivian, Levesque, Maxime, Grenier, Frédéric, Samandi, Sondos, Leblanc, Sébastien, Aguilar, Jean-David, Dufour, Pascal, Jacques, Jean-Francois, Fournier, Isabelle, Ouangraoua, Aida, Scott, Michelle S, Boisvert, François-Michel, Roucou, Xavier
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323990/
https://www.ncbi.nlm.nih.gov/pubmed/30299502
http://dx.doi.org/10.1093/nar/gky936
author Brunet, Marie A
Brunelle, Mylène
Lucier, Jean-François
Delcourt, Vivian
Levesque, Maxime
Grenier, Frédéric
Samandi, Sondos
Leblanc, Sébastien
Aguilar, Jean-David
Dufour, Pascal
Jacques, Jean-Francois
Fournier, Isabelle
Ouangraoua, Aida
Scott, Michelle S
Boisvert, François-Michel
Roucou, Xavier
author_sort Brunet, Marie A
collection PubMed
description Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arbitrary criterion of one coding sequence (CDS) per transcript, leading to a substantial underestimation of the coding potential of eukaryotes. Here, we present OpenProt, the first database fully endorsing a polycistronic model of eukaryotic genomes to date. OpenProt contains all possible ORFs longer than 30 codons across 10 species, and cumulates supporting evidence such as protein conservation, translation and expression. OpenProt annotates all known proteins (RefProts), novel predicted isoforms (Isoforms) and novel predicted proteins from alternative ORFs (AltProts). It incorporates cutting-edge algorithms to evaluate protein orthology and re-interrogate publicly available ribosome profiling and mass spectrometry datasets, supporting the annotation of thousands of predicted ORFs. The constantly growing database currently cumulates evidence from 87 ribosome profiling and 114 mass spectrometry studies from several species, tissues and cell lines. All data is freely available and downloadable from a web platform (www.openprot.org) supporting a genome browser and advanced queries for each species. Thus, OpenProt enables a more comprehensive landscape of eukaryotic genomes’ coding potential.
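The abstract's notion of "all possible ORFs longer than 30 codons" on a transcript can be illustrated with a minimal sketch. This is not OpenProt's pipeline: the function name `find_orfs`, the ATG-only start codon, the standard stop codons, the forward-strand scan of a single transcript, and counting length without the stop codon are all assumptions made for illustration. The point it shows is that one transcript can yield several ORFs in different reading frames, rather than a single CDS.

```python
# Illustrative sketch only; the names and conventions here are assumptions,
# not the method used by OpenProt.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(transcript, min_codons=30):
    """Return (start, end, frame) for each ATG-to-stop ORF on the forward
    strand whose length, excluding the stop codon, exceeds min_codons.

    start is 0-based and end is exclusive (it includes the stop codon).
    """
    seq = transcript.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # stop codon not counted
                if n_codons > min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    # Hypothetical transcript carrying two ORFs in different frames.
    orf_a = "ATG" + "GCC" * 40 + "TAA"
    orf_b = "ATG" + "AAA" * 35 + "TGA"
    for start, end, frame in find_orfs(orf_a + "C" + orf_b):
        print(f"frame {frame}: {start}-{end}")
```

Under a one-CDS-per-transcript rule only one of the two ORFs in the usage example would be kept; scanning every frame and keeping every ORF above the length cutoff is what a polycistronic view amounts to in this simplified setting.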
format Online
Article
Text
id pubmed-6323990
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6323990 2019-01-10 OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes Brunet, Marie A Brunelle, Mylène Lucier, Jean-François Delcourt, Vivian Levesque, Maxime Grenier, Frédéric Samandi, Sondos Leblanc, Sébastien Aguilar, Jean-David Dufour, Pascal Jacques, Jean-Francois Fournier, Isabelle Ouangraoua, Aida Scott, Michelle S Boisvert, François-Michel Roucou, Xavier Nucleic Acids Res Database Issue Advances in proteomics and sequencing have highlighted many non-annotated open reading frames (ORFs) in eukaryotic genomes. Genome annotations, cornerstones of today's research, mostly rely on protein prior knowledge and on ab initio prediction algorithms. Such algorithms notably enforce an arbitrary criterion of one coding sequence (CDS) per transcript, leading to a substantial underestimation of the coding potential of eukaryotes. Here, we present OpenProt, the first database fully endorsing a polycistronic model of eukaryotic genomes to date. OpenProt contains all possible ORFs longer than 30 codons across 10 species, and cumulates supporting evidence such as protein conservation, translation and expression. OpenProt annotates all known proteins (RefProts), novel predicted isoforms (Isoforms) and novel predicted proteins from alternative ORFs (AltProts). It incorporates cutting-edge algorithms to evaluate protein orthology and re-interrogate publicly available ribosome profiling and mass spectrometry datasets, supporting the annotation of thousands of predicted ORFs. The constantly growing database currently cumulates evidence from 87 ribosome profiling and 114 mass spectrometry studies from several species, tissues and cell lines. All data is freely available and downloadable from a web platform (www.openprot.org) supporting a genome browser and advanced queries for each species. Thus, OpenProt enables a more comprehensive landscape of eukaryotic genomes’ coding potential.
Oxford University Press 2019-01-08 2018-10-09 /pmc/articles/PMC6323990/ /pubmed/30299502 http://dx.doi.org/10.1093/nar/gky936 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title OpenProt: a more comprehensive guide to explore eukaryotic coding potential and proteomes
topic Database Issue
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323990/
https://www.ncbi.nlm.nih.gov/pubmed/30299502
http://dx.doi.org/10.1093/nar/gky936