A note on the false discovery rate of novel peptides in proteogenomics

Motivation: Proteogenomics has been well accepted as a tool to discover novel genes. In most conventional proteogenomic studies, a global false discovery rate is used to filter out false positives for identifying credible novel peptides. However, it has been found that the actual level of false posi...

Bibliographic Details
Main Authors: Zhang, Kun; Fu, Yan; Zeng, Wen-Feng; He, Kun; Chi, Hao; Liu, Chao; Li, Yan-Chang; Gao, Yuan; Xu, Ping; He, Si-Min
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4595894/
https://www.ncbi.nlm.nih.gov/pubmed/26076724
http://dx.doi.org/10.1093/bioinformatics/btv340
_version_ 1782393690923204608
author Zhang, Kun
Fu, Yan
Zeng, Wen-Feng
He, Kun
Chi, Hao
Liu, Chao
Li, Yan-Chang
Gao, Yuan
Xu, Ping
He, Si-Min
author_sort Zhang, Kun
collection PubMed
description Motivation: Proteogenomics has been well accepted as a tool to discover novel genes. In most conventional proteogenomic studies, a global false discovery rate is used to filter out false positives for identifying credible novel peptides. However, it has been found that the actual level of false positives in novel peptides is often out of control and behaves differently for different genomes. Results: To quantitatively model this problem, we theoretically analyze the subgroup false discovery rates of annotated and novel peptides. Our analysis shows that the annotation completeness ratio of a genome is the dominant factor influencing the subgroup FDR of novel peptides. Experimental results on two real datasets of Escherichia coli and Mycobacterium tuberculosis support our conjecture. Contact: yfu@amss.ac.cn or xupingghy@gmail.com or smhe@ict.ac.cn Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4595894
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-45958942015-10-09 A note on the false discovery rate of novel peptides in proteogenomics Zhang, Kun Fu, Yan Zeng, Wen-Feng He, Kun Chi, Hao Liu, Chao Li, Yan-Chang Gao, Yuan Xu, Ping He, Si-Min Bioinformatics Discovery Note Motivation: Proteogenomics has been well accepted as a tool to discover novel genes. In most conventional proteogenomic studies, a global false discovery rate is used to filter out false positives for identifying credible novel peptides. However, it has been found that the actual level of false positives in novel peptides is often out of control and behaves differently for different genomes. Results: To quantitatively model this problem, we theoretically analyze the subgroup false discovery rates of annotated and novel peptides. Our analysis shows that the annotation completeness ratio of a genome is the dominant factor influencing the subgroup FDR of novel peptides. Experimental results on two real datasets of Escherichia coli and Mycobacterium tuberculosis support our conjecture. Contact: yfu@amss.ac.cn or xupingghy@gmail.com or smhe@ict.ac.cn Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2015-10-15 2015-06-14 /pmc/articles/PMC4595894/ /pubmed/26076724 http://dx.doi.org/10.1093/bioinformatics/btv340 Text en © The Author 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A note on the false discovery rate of novel peptides in proteogenomics
topic Discovery Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4595894/
https://www.ncbi.nlm.nih.gov/pubmed/26076724
http://dx.doi.org/10.1093/bioinformatics/btv340