OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases
Proteomic diversity in biological samples can be characterized by mass spectrometry (MS)-based proteomics using customized protein databases generated from sets of transcripts previously detected by RNA-seq. This diversity has only been increased by the recent discovery that many t...
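Main authors: | Guilloy, Noé; Brunet, Marie A.; Leblanc, Sébastien; Jacques, Jean-François; Hardy, Marie-Pierre; Ehx, Grégory; Lanoix, Joël; Thibault, Pierre; Perreault, Claude; Roucou, Xavier |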
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2023 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10167680/ https://www.ncbi.nlm.nih.gov/pubmed/36961377 http://dx.doi.org/10.1021/acs.jproteome.3c00054 |
_version_ | 1785038723368353792 |
---|---|
author | Guilloy, Noé Brunet, Marie A. Leblanc, Sébastien Jacques, Jean-François Hardy, Marie-Pierre Ehx, Grégory Lanoix, Joël Thibault, Pierre Perreault, Claude Roucou, Xavier |
author_facet | Guilloy, Noé Brunet, Marie A. Leblanc, Sébastien Jacques, Jean-François Hardy, Marie-Pierre Ehx, Grégory Lanoix, Joël Thibault, Pierre Perreault, Claude Roucou, Xavier |
author_sort | Guilloy, Noé |
collection | PubMed |
description | Proteomic diversity in biological samples can be characterized by mass spectrometry (MS)-based proteomics using customized protein databases generated from sets of transcripts previously detected by RNA-seq. This diversity has only been increased by the recent discovery that many translated alternative open reading frames remain unannotated at unsuspected locations of mRNAs and ncRNAs. These novel protein products, termed alternative proteins, have been left out of all previous custom database generation tools. Consequently, genetic variations that impact alternative open reading frames and variant peptides from their translated proteins are not detectable with current computational workflows. To fill this gap, we present OpenCustomDB, a bioinformatics tool that uses sample-specific RNA-seq data to identify genomic variants in canonical and alternative open reading frames, allowing for more than one coding region per transcript. In a test reanalysis of a cohort of 16 patients with acute myeloid leukemia, 5666 peptides from alternative proteins were detected, including 201 variant peptides. We also observed that a significant fraction of peptide-spectrum matches previously assigned to peptides from canonical proteins received better scores when reassigned to peptides from alternative proteins. Custom protein libraries that include sample-specific sequence variations of all possible open reading frames are promising contributions to the development of proteomics and precision medicine. The raw and processed proteomics data presented in this study can be found in the PRIDE repository with accession number PXD029240. |
format | Online Article Text |
id | pubmed-10167680 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-10167680 2023-05-10 OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases Guilloy, Noé Brunet, Marie A. Leblanc, Sébastien Jacques, Jean-François Hardy, Marie-Pierre Ehx, Grégory Lanoix, Joël Thibault, Pierre Perreault, Claude Roucou, Xavier J Proteome Res Proteomic diversity in biological samples can be characterized by mass spectrometry (MS)-based proteomics using customized protein databases generated from sets of transcripts previously detected by RNA-seq. This diversity has only been increased by the recent discovery that many translated alternative open reading frames remain unannotated at unsuspected locations of mRNAs and ncRNAs. These novel protein products, termed alternative proteins, have been left out of all previous custom database generation tools. Consequently, genetic variations that impact alternative open reading frames and variant peptides from their translated proteins are not detectable with current computational workflows. To fill this gap, we present OpenCustomDB, a bioinformatics tool that uses sample-specific RNA-seq data to identify genomic variants in canonical and alternative open reading frames, allowing for more than one coding region per transcript. In a test reanalysis of a cohort of 16 patients with acute myeloid leukemia, 5666 peptides from alternative proteins were detected, including 201 variant peptides. We also observed that a significant fraction of peptide-spectrum matches previously assigned to peptides from canonical proteins received better scores when reassigned to peptides from alternative proteins. Custom protein libraries that include sample-specific sequence variations of all possible open reading frames are promising contributions to the development of proteomics and precision medicine. The raw and processed proteomics data presented in this study can be found in the PRIDE repository with accession number PXD029240. American Chemical Society 2023-03-24 /pmc/articles/PMC10167680/ /pubmed/36961377 http://dx.doi.org/10.1021/acs.jproteome.3c00054 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Guilloy, Noé Brunet, Marie A. Leblanc, Sébastien Jacques, Jean-François Hardy, Marie-Pierre Ehx, Grégory Lanoix, Joël Thibault, Pierre Perreault, Claude Roucou, Xavier OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases |
title | OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases |
title_full | OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases |
title_fullStr | OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases |
title_full_unstemmed | OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases |
title_short | OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases |
title_sort | opencustomdb: integration of unannotated open reading frames and genetic variants to generate more comprehensive customized protein databases |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10167680/ https://www.ncbi.nlm.nih.gov/pubmed/36961377 http://dx.doi.org/10.1021/acs.jproteome.3c00054 |
work_keys_str_mv | AT guilloynoe opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT brunetmariea opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT leblancsebastien opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT jacquesjeanfrancois opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT hardymariepierre opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT ehxgregory opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT lanoixjoel opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT thibaultpierre opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT perreaultclaude opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases AT roucouxavier opencustomdbintegrationofunannotatedopenreadingframesandgeneticvariantstogeneratemorecomprehensivecustomizedproteindatabases |