Finding haplotypic signatures in proteins

BACKGROUND: The nonrandom distribution of alleles of common genomic variants produces haplotypes, which are fundamental in medical and population genetic studies. Consequently, protein-coding genes with different co-occurring sets of alleles can encode different amino acid sequences: protein haploty...

Bibliographic Details
Main Authors: Vašíček, Jakub, Skiadopoulou, Dafni, Kuznetsova, Ksenia G, Wen, Bo, Johansson, Stefan, Njølstad, Pål R, Bruckner, Stefan, Käll, Lukas, Vaudel, Marc
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622322/
https://www.ncbi.nlm.nih.gov/pubmed/37919975
http://dx.doi.org/10.1093/gigascience/giad093
_version_ 1785130510729609216
author Vašíček, Jakub
Skiadopoulou, Dafni
Kuznetsova, Ksenia G
Wen, Bo
Johansson, Stefan
Njølstad, Pål R
Bruckner, Stefan
Käll, Lukas
Vaudel, Marc
author_sort Vašíček, Jakub
collection PubMed
description BACKGROUND: The nonrandom distribution of alleles of common genomic variants produces haplotypes, which are fundamental in medical and population genetic studies. Consequently, protein-coding genes with different co-occurring sets of alleles can encode different amino acid sequences: protein haplotypes. These protein haplotypes are present in biological samples and detectable by mass spectrometry, but they are not accounted for in proteomic searches. Consequently, the impact of haplotypic variation on the results of proteomic searches and the discoverability of peptides specific to haplotypes remain unknown. FINDINGS: Here, we study how common genetic haplotypes influence the proteomic search space and investigate the possibility to match peptides containing multiple amino acid substitutions to a publicly available data set of mass spectra. We found that for 12.42% of the discoverable amino acid substitutions encoded by common haplotypes, 2 or more substitutions may co-occur in the same peptide after tryptic digestion of the protein haplotypes. We identified 352 spectra that matched to such multivariant peptides, and out of the 4,582 amino acid substitutions identified, 6.37% were covered by multivariant peptides. However, the evaluation of the reliability of these matches remains challenging, suggesting that refined error rate estimation procedures are needed for such complex proteomic searches. CONCLUSIONS: As these procedures become available and the ability to analyze protein haplotypes increases, we anticipate that proteomics will provide new information on the consequences of common variation, across tissues and time.
format Online
Article
Text
id pubmed-10622322
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10622322 2023-11-04 Finding haplotypic signatures in proteins Vašíček, Jakub Skiadopoulou, Dafni Kuznetsova, Ksenia G Wen, Bo Johansson, Stefan Njølstad, Pål R Bruckner, Stefan Käll, Lukas Vaudel, Marc Gigascience Research
Oxford University Press 2023-10-31 /pmc/articles/PMC10622322/ /pubmed/37919975 http://dx.doi.org/10.1093/gigascience/giad093 Text en © The Author(s) 2023. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Finding haplotypic signatures in proteins
topic Research