
How to talk about protein‐level false discovery rates in shotgun proteomics

A frequently sought output from a shotgun proteomics experiment is a list of proteins that we believe to have been present in the analyzed sample before proteolytic digestion. The standard technique to control for errors in such lists is to enforce a preset threshold for the false discovery rate (FD...


Bibliographic Details
Main Authors: The, Matthew, Tasnim, Ayesha, Käll, Lukas
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5096025/
https://www.ncbi.nlm.nih.gov/pubmed/27503675
http://dx.doi.org/10.1002/pmic.201500431
collection PubMed
description A frequently sought output from a shotgun proteomics experiment is a list of proteins that we believe to have been present in the analyzed sample before proteolytic digestion. The standard technique to control for errors in such lists is to enforce a preset threshold for the false discovery rate (FDR). Many consider protein‐level FDRs a difficult and vague concept, as the measurement entities, spectra, are manifestations of peptides and not proteins. Here, we argue that this confusion is unnecessary and provide a framework on how to think about protein‐level FDRs, starting from its basic principle: the null hypothesis. Specifically, we point out that two competing null hypotheses are used concurrently in today's protein inference methods, which has gone unnoticed by many. Using simulations of a shotgun proteomics experiment, we show how confusing one null hypothesis for the other can lead to serious discrepancies in the FDR. Furthermore, we demonstrate how the same simulations can be used to verify FDR estimates of protein inference methods. In particular, we show that, for a simple protein inference method, decoy models can be used to accurately estimate protein‐level FDRs for both competing null hypotheses.
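The description above states that decoy models can be used to estimate protein-level FDRs. As a rough illustration only, the following is a minimal sketch of generic target-decoy FDR estimation over a ranked protein list. It is not the authors' implementation, and it does not distinguish between the two null hypotheses the article discusses. The function name `decoy_q_values`, the `(score, is_decoy)` input layout, and the convention that higher scores are better are all assumptions made for this example.

```python
from typing import List, Tuple


def decoy_q_values(scored: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Estimate q-values for target proteins from a mixed target/decoy list.

    `scored` holds (score, is_decoy) pairs; higher scores are assumed better.
    Returns (score, q_value) for each target, best score first.
    Hypothetical helper for illustration, not a published API.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)

    targets = 0
    decoys = 0
    fdr_at_target: List[Tuple[float, float]] = []
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
            # Estimated FDR among targets accepted so far: decoys / targets,
            # capped at 1.0.
            fdr_at_target.append((score, min(1.0, decoys / targets)))

    # A q-value is the smallest estimated FDR at this threshold or any more
    # permissive one, so take a running minimum from the worst score upward.
    q_values: List[Tuple[float, float]] = []
    running_min = 1.0
    for score, fdr in reversed(fdr_at_target):
        running_min = min(running_min, fdr)
        q_values.append((score, running_min))
    q_values.reverse()
    return q_values
```

To obtain a protein list at a preset FDR threshold, one would keep the targets whose q-value is at or below that threshold. Ties between a target and a decoy score are resolved here by input order, which is an arbitrary choice in this sketch.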
id pubmed-5096025
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5096025 2016-11-09 Proteomics Research Articles John Wiley and Sons Inc. 2016-09-19 2016-09 /pmc/articles/PMC5096025/ /pubmed/27503675 http://dx.doi.org/10.1002/pmic.201500431 Text en
© 2016 The Authors. Proteomics Published by Wiley‐VCH Verlag GmbH & Co. KGaA, Weinheim. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
topic Research Articles