Finding haplotypic signatures in proteins
BACKGROUND: The nonrandom distribution of alleles of common genomic variants produces haplotypes, which are fundamental in medical and population genetic studies. Consequently, protein-coding genes with different co-occurring sets of alleles can encode different amino acid sequences: protein haploty...
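Main Authors: | Vašíček, Jakub, Skiadopoulou, Dafni, Kuznetsova, Ksenia G, Wen, Bo, Johansson, Stefan, Njølstad, Pål R, Bruckner, Stefan, Käll, Lukas, Vaudel, Marc |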
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622322/ https://www.ncbi.nlm.nih.gov/pubmed/37919975 http://dx.doi.org/10.1093/gigascience/giad093 |
_version_ | 1785130510729609216 |
---|---|
author | Vašíček, Jakub Skiadopoulou, Dafni Kuznetsova, Ksenia G Wen, Bo Johansson, Stefan Njølstad, Pål R Bruckner, Stefan Käll, Lukas Vaudel, Marc |
author_facet | Vašíček, Jakub Skiadopoulou, Dafni Kuznetsova, Ksenia G Wen, Bo Johansson, Stefan Njølstad, Pål R Bruckner, Stefan Käll, Lukas Vaudel, Marc |
author_sort | Vašíček, Jakub |
collection | PubMed |
description | BACKGROUND: The nonrandom distribution of alleles of common genomic variants produces haplotypes, which are fundamental in medical and population genetic studies. Consequently, protein-coding genes with different co-occurring sets of alleles can encode different amino acid sequences: protein haplotypes. These protein haplotypes are present in biological samples and detectable by mass spectrometry, but they are not accounted for in proteomic searches. Consequently, the impact of haplotypic variation on the results of proteomic searches and the discoverability of peptides specific to haplotypes remain unknown. FINDINGS: Here, we study how common genetic haplotypes influence the proteomic search space and investigate the possibility to match peptides containing multiple amino acid substitutions to a publicly available data set of mass spectra. We found that for 12.42% of the discoverable amino acid substitutions encoded by common haplotypes, 2 or more substitutions may co-occur in the same peptide after tryptic digestion of the protein haplotypes. We identified 352 spectra that matched to such multivariant peptides, and out of the 4,582 amino acid substitutions identified, 6.37% were covered by multivariant peptides. However, the evaluation of the reliability of these matches remains challenging, suggesting that refined error rate estimation procedures are needed for such complex proteomic searches. CONCLUSIONS: As these procedures become available and the ability to analyze protein haplotypes increases, we anticipate that proteomics will provide new information on the consequences of common variation, across tissues and time. |
format | Online Article Text |
id | pubmed-10622322 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10622322 2023-11-04 Finding haplotypic signatures in proteins Vašíček, Jakub Skiadopoulou, Dafni Kuznetsova, Ksenia G Wen, Bo Johansson, Stefan Njølstad, Pål R Bruckner, Stefan Käll, Lukas Vaudel, Marc Gigascience Research BACKGROUND: The nonrandom distribution of alleles of common genomic variants produces haplotypes, which are fundamental in medical and population genetic studies. Consequently, protein-coding genes with different co-occurring sets of alleles can encode different amino acid sequences: protein haplotypes. These protein haplotypes are present in biological samples and detectable by mass spectrometry, but they are not accounted for in proteomic searches. Consequently, the impact of haplotypic variation on the results of proteomic searches and the discoverability of peptides specific to haplotypes remain unknown. FINDINGS: Here, we study how common genetic haplotypes influence the proteomic search space and investigate the possibility to match peptides containing multiple amino acid substitutions to a publicly available data set of mass spectra. We found that for 12.42% of the discoverable amino acid substitutions encoded by common haplotypes, 2 or more substitutions may co-occur in the same peptide after tryptic digestion of the protein haplotypes. We identified 352 spectra that matched to such multivariant peptides, and out of the 4,582 amino acid substitutions identified, 6.37% were covered by multivariant peptides. However, the evaluation of the reliability of these matches remains challenging, suggesting that refined error rate estimation procedures are needed for such complex proteomic searches. CONCLUSIONS: As these procedures become available and the ability to analyze protein haplotypes increases, we anticipate that proteomics will provide new information on the consequences of common variation, across tissues and time.
Oxford University Press 2023-10-31 /pmc/articles/PMC10622322/ /pubmed/37919975 http://dx.doi.org/10.1093/gigascience/giad093 Text en © The Author(s) 2023. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Vašíček, Jakub Skiadopoulou, Dafni Kuznetsova, Ksenia G Wen, Bo Johansson, Stefan Njølstad, Pål R Bruckner, Stefan Käll, Lukas Vaudel, Marc Finding haplotypic signatures in proteins |
title | Finding haplotypic signatures in proteins |
title_full | Finding haplotypic signatures in proteins |
title_fullStr | Finding haplotypic signatures in proteins |
title_full_unstemmed | Finding haplotypic signatures in proteins |
title_short | Finding haplotypic signatures in proteins |
title_sort | finding haplotypic signatures in proteins |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622322/ https://www.ncbi.nlm.nih.gov/pubmed/37919975 http://dx.doi.org/10.1093/gigascience/giad093 |
work_keys_str_mv | AT vasicekjakub findinghaplotypicsignaturesinproteins AT skiadopouloudafni findinghaplotypicsignaturesinproteins AT kuznetsovakseniag findinghaplotypicsignaturesinproteins AT wenbo findinghaplotypicsignaturesinproteins AT johanssonstefan findinghaplotypicsignaturesinproteins AT njølstadpalr findinghaplotypicsignaturesinproteins AT brucknerstefan findinghaplotypicsignaturesinproteins AT kalllukas findinghaplotypicsignaturesinproteins AT vaudelmarc findinghaplotypicsignaturesinproteins |