The Choice of Search Engine Affects Sequencing Depth and HLA Class I Allele-Specific Peptide Repertoires

Standardization of immunopeptidomics experiments across laboratories is a pressing issue within the field, and a variety of sample preparation methods and data analysis tools are currently applied. Here, we compared different software packages to interrogate immunopeptidomics datasets...

Bibliographic Details
Main authors: Parker, Robert; Tailor, Arun; Peng, Xu; Nicastri, Annalisa; Zerweck, Johannes; Reimer, Ulf; Wenschuh, Holger; Schnatbaum, Karsten; Ternette, Nicola
Format: Online Article Text
Language: English
Published: American Society for Biochemistry and Molecular Biology 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8724928/
https://www.ncbi.nlm.nih.gov/pubmed/34303857
http://dx.doi.org/10.1016/j.mcpro.2021.100124
author Parker, Robert
Tailor, Arun
Peng, Xu
Nicastri, Annalisa
Zerweck, Johannes
Reimer, Ulf
Wenschuh, Holger
Schnatbaum, Karsten
Ternette, Nicola
author_sort Parker, Robert
collection PubMed
description Standardization of immunopeptidomics experiments across laboratories is a pressing issue within the field, and a variety of sample preparation methods and data analysis tools are currently applied. Here, we compared different software packages to interrogate immunopeptidomics datasets and found that PEAKS reproducibly reports substantially more peptide sequences (~30–70%) than MaxQuant, Comet, and MS-GF+ at a global false discovery rate (FDR) of <1%. We noted that these differences are driven by search space and spectral ranking. Furthermore, we observed differences in the proportion of peptides binding the human leukocyte antigen (HLA) alleles present in the samples, indicating that sequence-related differences affected the performance of each tested engine. Using data from cell lines expressing a single HLA allele, we observed significant differences in amino acid frequency among the peptides reported, with a broadly higher representation of the hydrophobic amino acids L, I, P, and V reported by PEAKS. We validated these results using data generated with a synthetic library of 2000 HLA-associated peptides from four common HLA alleles with distinct anchor residues. Our investigation highlights that search engines create a bias in peptide sequence depth and peptide amino acid composition, and resulting data should be interpreted with caution.
format Online
Article
Text
id pubmed-8724928
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Society for Biochemistry and Molecular Biology
record_format MEDLINE/PubMed
spelling pubmed-8724928 2022-01-11 Mol Cell Proteomics Research
American Society for Biochemistry and Molecular Biology 2021-07-23 /pmc/articles/PMC8724928/ /pubmed/34303857 http://dx.doi.org/10.1016/j.mcpro.2021.100124 Text en © 2021 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title The Choice of Search Engine Affects Sequencing Depth and HLA Class I Allele-Specific Peptide Repertoires
title_sort choice of search engine affects sequencing depth and hla class i allele-specific peptide repertoires
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8724928/
https://www.ncbi.nlm.nih.gov/pubmed/34303857
http://dx.doi.org/10.1016/j.mcpro.2021.100124