An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters

BACKGROUND: Many studies have provided algorithms or methods to assess statistical significance in quantitative proteomics when multiple replicates for a protein sample and an LC/MS analysis are available. However, confidence is still lacking in using datasets for biological interpretation without pr...

Bibliographic Details
Main Authors: Li, Qingbo, Roxas, Bryan AP
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2645366/
https://www.ncbi.nlm.nih.gov/pubmed/19187558
http://dx.doi.org/10.1186/1471-2105-10-43
_version_ 1782164774096732160
author Li, Qingbo
Roxas, Bryan AP
author_facet Li, Qingbo
Roxas, Bryan AP
author_sort Li, Qingbo
collection PubMed
description BACKGROUND: Many studies have provided algorithms or methods to assess statistical significance in quantitative proteomics when multiple replicates for a protein sample and an LC/MS analysis are available. However, confidence is still lacking when datasets without protein sample replicates are used for biological interpretation. Although a fold-change is a conventional threshold that can be used when there are no sample replicates, it does not provide an assessment of statistical significance such as a false discovery rate (FDR), which is an important indicator of the reliability of identifying differentially expressed proteins. In this work, we investigate whether differentially expressed proteins can be detected with statistical significance from a pair of unlabeled protein samples without replicates and with only duplicate LC/MS injections per sample. An FDR is used to gauge the statistical significance of the differentially expressed proteins. RESULTS: We experimented with several parameters to control the FDR, including a fold-change, a statistical test, and a minimum number of permuted significant pairings. Although none of these parameters alone gives satisfactory control of the FDR, we find that a combination of these parameters provides a very effective means to control the FDR without compromising sensitivity. The results suggest that it is possible to perform a significance analysis without protein sample replicates; only duplicate LC/MS injections per sample are needed. We illustrate that differentially expressed proteins can be detected with an FDR between 0 and 15% at a positive rate of 4–16%. The method is evaluated for its sensitivity and specificity by ROC analysis, and is further validated with a [(15)N]-labeled internal-standard protein sample and additional unlabeled protein sample replicates.
CONCLUSION: We demonstrate that statistical significance can be inferred without protein sample replicates in label-free quantitative proteomics. The approach described in this study would be useful in many exploratory experiments where sample amount or instrument time is limited. Naturally, this method is also suitable for proteomics experiments where multiple sample replicates are available. It is simple and is complementary to more sophisticated algorithms that are not designed for dealing with a small number of sample replicates.
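The idea of combining filters described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the function names, the use of a one-sample t statistic on peptide log2 ratios as the "statistical test", the cutoff values, and the FDR estimate (calls made in same-sample comparisons divided by calls made in between-sample comparisons). It should not be read as the authors' procedure.

```python
from math import sqrt
from statistics import mean, stdev


def passes_filters(log2_ratios, min_abs_log2_fc=1.0, min_abs_t=3.0):
    """Return True if one pairing passes BOTH the fold-change and the t filter.

    log2_ratios: peptide-level log2 ratios for one protein in one pairing
    of injections (hypothetical input layout).
    """
    if len(log2_ratios) < 2:
        return False  # cannot compute a standard deviation
    m = mean(log2_ratios)
    if abs(m) < min_abs_log2_fc:
        return False  # fold-change filter
    s = stdev(log2_ratios)
    if s == 0:
        return True  # identical ratios that already passed the fold-change cutoff
    t = m / (s / sqrt(len(log2_ratios)))
    return abs(t) >= min_abs_t  # statistical-test filter


def count_calls(pairings, min_pairings, **cutoffs):
    """Count proteins that pass the filters in at least min_pairings pairings.

    pairings: dict mapping protein name -> list of per-pairing ratio lists.
    """
    return sum(
        1
        for ratio_lists in pairings.values()
        if sum(passes_filters(r, **cutoffs) for r in ratio_lists) >= min_pairings
    )


def estimate_fdr(null_calls, test_calls):
    """Assumed estimate: same-sample calls / between-sample calls, capped at 1."""
    if test_calls == 0:
        return 0.0
    return min(1.0, null_calls / test_calls)


# Toy data: four between-sample pairings per protein (A1/B1, A1/B2, A2/B1, A2/B2).
between = {
    "P1": [[1.2, 1.4, 1.3], [1.1, 1.5, 1.3], [1.2, 1.2, 1.5], [1.3, 1.4, 1.2]],
    "P2": [[0.1, -0.1, 0.05], [0.0, 0.2, -0.1], [0.1, 0.1, -0.2], [0.0, -0.1, 0.1]],
}
n_between = count_calls(between, min_pairings=3)
```

In this toy example only "P1" is called, because "P2" fails the fold-change filter in every pairing. A pairing such as [2.0, -0.5, 1.6] passes the fold-change cutoff on its mean but fails the t cutoff, which is the kind of case where a combined filter differs from a fold-change threshold alone.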
format Text
id pubmed-2645366
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2645366 2009-02-20 An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters Li, Qingbo Roxas, Bryan AP BMC Bioinformatics Methodology Article BioMed Central 2009-02-02 /pmc/articles/PMC2645366/ /pubmed/19187558 http://dx.doi.org/10.1186/1471-2105-10-43 Text en Copyright © 2009 Li and Roxas; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Li, Qingbo
Roxas, Bryan AP
An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters
title An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters
title_full An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters
title_fullStr An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters
title_full_unstemmed An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters
title_short An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters
title_sort assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2645366/
https://www.ncbi.nlm.nih.gov/pubmed/19187558
http://dx.doi.org/10.1186/1471-2105-10-43
work_keys_str_mv AT liqingbo anassessmentoffalsediscoveryratesandstatisticalsignificanceinlabelfreequantitativeproteomicswithcombinedfilters
AT roxasbryanap anassessmentoffalsediscoveryratesandstatisticalsignificanceinlabelfreequantitativeproteomicswithcombinedfilters
AT liqingbo assessmentoffalsediscoveryratesandstatisticalsignificanceinlabelfreequantitativeproteomicswithcombinedfilters
AT roxasbryanap assessmentoffalsediscoveryratesandstatisticalsignificanceinlabelfreequantitativeproteomicswithcombinedfilters