Targeted Feature Detection for Data-Dependent Shotgun Proteomics

Label-free quantification of shotgun LC–MS/MS data is the prevailing approach in quantitative proteomics but remains computationally nontrivial. The central data analysis step is the detection of peptide-specific signal patterns, called features. Peptide quantification is facilitat...

Bibliographic Details
Main authors:	Weisser, Hendrik; Choudhary, Jyoti S.
Format:	Online Article Text
Language:	English
Published:	American Chemical Society 2017
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5547443/ https://www.ncbi.nlm.nih.gov/pubmed/28673088 http://dx.doi.org/10.1021/acs.jproteome.7b00248

_version_	1783255690530258944
author	Weisser, Hendrik Choudhary, Jyoti S.
author_facet	Weisser, Hendrik Choudhary, Jyoti S.
author_sort	Weisser, Hendrik
collection	PubMed
description	Label-free quantification of shotgun LC–MS/MS data is the prevailing approach in quantitative proteomics but remains computationally nontrivial. The central data analysis step is the detection of peptide-specific signal patterns, called features. Peptide quantification is facilitated by associating signal intensities in features with peptide sequences derived from MS2 spectra; however, missing values due to imperfect feature detection are a common problem. A feature detection approach that directly targets identified peptides (minimizing missing values) but also offers robustness against false-positive features (by assigning meaningful confidence scores) would thus be highly desirable. We developed a new feature detection algorithm within the OpenMS software framework, leveraging ideas and algorithms from the OpenSWATH toolset for DIA/SRM data analysis. Our software, FeatureFinderIdentification (“FFId”), implements a targeted approach to feature detection based on information from identified peptides. This information is encoded in an MS1 assay library, based on which ion chromatogram extraction and detection of feature candidates are carried out. Significantly, when analyzing data from experiments comprising multiple samples, our approach distinguishes between “internal” and “external” (inferred) peptide identifications (IDs) for each sample. On the basis of internal IDs, two sets of positive (true) and negative (decoy) feature candidates are defined. A support vector machine (SVM) classifier is then trained to discriminate between the sets and is subsequently applied to the “uncertain” feature candidates from external IDs, facilitating selection and confidence scoring of the best feature candidate for each peptide. This approach also enables our algorithm to estimate the false discovery rate (FDR) of the feature selection step.
We validated FFId based on a public benchmark data set, comprising a yeast cell lysate spiked with protein standards that provide a known ground-truth. The algorithm reached almost complete (>99%) quantification coverage for the full set of peptides identified at 1% FDR (PSM level). Compared with other software solutions for label-free quantification, this is an outstanding result, which was achieved at competitive quantification accuracy and reproducibility across replicates. The FDR for the feature selection was estimated at a low 1.5% on average per sample (3% for features inferred from external peptide IDs). The FFId software is open-source and freely available as part of OpenMS (www.openms.org).
format	Online Article Text
id	pubmed-5547443
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	American Chemical Society
record_format	MEDLINE/PubMed
spelling	pubmed-5547443 2017-08-09 Targeted Feature Detection for Data-Dependent Shotgun Proteomics Weisser, Hendrik Choudhary, Jyoti S. J Proteome Res American Chemical Society 2017-07-04 2017-08-04 /pmc/articles/PMC5547443/ /pubmed/28673088 http://dx.doi.org/10.1021/acs.jproteome.7b00248 Text en Copyright © 2017 American Chemical Society This is an open access article published under a Creative Commons Attribution (CC-BY) License (http://pubs.acs.org/page/policy/authorchoice_ccby_termsofuse.html), which permits unrestricted use, distribution and reproduction in any medium, provided the author and source are cited.
spellingShingle	Weisser, Hendrik Choudhary, Jyoti S. Targeted Feature Detection for Data-Dependent Shotgun Proteomics
title	Targeted Feature Detection for Data-Dependent Shotgun Proteomics
title_full	Targeted Feature Detection for Data-Dependent Shotgun Proteomics
title_fullStr	Targeted Feature Detection for Data-Dependent Shotgun Proteomics
title_full_unstemmed	Targeted Feature Detection for Data-Dependent Shotgun Proteomics
title_short	Targeted Feature Detection for Data-Dependent Shotgun Proteomics
title_sort	targeted feature detection for data-dependent shotgun proteomics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5547443/ https://www.ncbi.nlm.nih.gov/pubmed/28673088 http://dx.doi.org/10.1021/acs.jproteome.7b00248
work_keys_str_mv	AT weisserhendrik targetedfeaturedetectionfordatadependentshotgunproteomics AT choudharyjyotis targetedfeaturedetectionfordatadependentshotgunproteomics
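
The description above mentions that ion chromatograms are extracted for each identified peptide on the basis of an MS1 assay library. The record gives no implementation details, so the following is only a minimal sketch of the general idea of targeted chromatogram extraction, not the FFId or OpenMS code. All names (`Spectrum`, `extract_chromatogram`, `trapezoid_area`), the 10 ppm mass tolerance, and the 60 s retention time window are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Spectrum:
    """One MS1 scan (hypothetical container, not an OpenMS type)."""
    rt: float                          # retention time in seconds
    peaks: List[Tuple[float, float]]   # (m/z, intensity) pairs


def extract_chromatogram(spectra, target_mz, rt_center,
                         rt_window=60.0, ppm_tol=10.0):
    """Return [(rt, summed intensity)] around an expected peptide signal.

    Only scans within rt_window / 2 of rt_center are used, and only peaks
    within ppm_tol of target_mz contribute. Both defaults are assumed.
    """
    mz_tol = target_mz * ppm_tol / 1e6
    half_window = rt_window / 2.0
    trace = []
    for spec in sorted(spectra, key=lambda s: s.rt):
        if abs(spec.rt - rt_center) > half_window:
            continue  # scan lies outside the retention time window
        intensity = sum(i for mz, i in spec.peaks
                        if abs(mz - target_mz) <= mz_tol)
        trace.append((spec.rt, intensity))
    return trace


def trapezoid_area(trace):
    """Integrate a chromatogram trace with the trapezoidal rule."""
    return sum((t1 - t0) * (i0 + i1) / 2.0
               for (t0, i0), (t1, i1) in zip(trace, trace[1:]))


# Toy data: one target at m/z 500.25, expected to elute near 115 s.
spectra = [
    Spectrum(100.0, [(500.2500, 0.0), (600.0, 50.0)]),
    Spectrum(110.0, [(500.2501, 100.0)]),
    Spectrum(120.0, [(500.2499, 200.0), (500.30, 999.0)]),
    Spectrum(130.0, [(500.2500, 100.0)]),
    Spectrum(300.0, [(500.2500, 500.0)]),  # outside the RT window
]
trace = extract_chromatogram(spectra, target_mz=500.25, rt_center=115.0)
area = trapezoid_area(trace)
```

In this toy example the scan at 300 s and the off-target peaks at m/z 600.0 and 500.30 are ignored, so the trace holds four points. A real feature detector would additionally look for isotopic traces and pick elution peaks within the extracted chromatograms.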

Similar items