
Using the entrapment sequence method as a standard to evaluate key steps of proteomics data analysis process

BACKGROUND: The mass spectrometry-based technical pipeline has provided a high-throughput, high-sensitivity and high-resolution platform for post-genomic biology. Varied models and algorithms are implemented by different tools to improve proteomics data analysis. The target-decoy searching strategy...


Bibliographic Details
Main Authors:	Feng, Xiao-dong; Li, Li-wei; Zhang, Jian-hong; Zhu, Yun-ping; Chang, Cheng; Shu, Kun-xian; Ma, Jie
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374549/ https://www.ncbi.nlm.nih.gov/pubmed/28361671 http://dx.doi.org/10.1186/s12864-017-3491-2

author	Feng, Xiao-dong; Li, Li-wei; Zhang, Jian-hong; Zhu, Yun-ping; Chang, Cheng; Shu, Kun-xian; Ma, Jie
collection	PubMed
description	BACKGROUND: The mass spectrometry-based technical pipeline has provided a high-throughput, high-sensitivity and high-resolution platform for post-genomic biology. Varied models and algorithms are implemented by different tools to improve proteomics data analysis. The target-decoy searching strategy has become the most popular strategy to control false identifications in peptide and protein identification. While this strategy can estimate the false discovery rate (FDR) within a dataset, it cannot directly evaluate the false positive matches among target identifications. RESULTS: As a supplement to the target-decoy strategy, the entrapment sequence method was introduced to assess the key steps of the mass spectrometry data analysis process: database search engines and quality control methods. Using the entrapment sequences as the standard, we evaluated five database search engines on both their original scores and reprocessed scores, as well as four quality control methods, in terms of quantity and quality. Our results showed that the most recently developed search engine, MS-GF+, and the Percolator-embedded quality control method PepDistiller performed best among the search engines and the quality control methods, respectively. Combined with efficient quality control methods, the search engines can improve on the low sensitivity of their original scores. Moreover, based on the entrapment sequence method, we proved that filtering the identifications separately could increase the number of identified peptides while improving the confidence level. CONCLUSION: In this study, we have proved that the entrapment sequence method could be a useful strategy to assess the key steps of the mass spectrometry data analysis process. Its applications can be extended to all steps of the common workflow, such as the protein assembling methods and data integration methods.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-3491-2) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5374549
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5374549 2017-03-31 Using the entrapment sequence method as a standard to evaluate key steps of proteomics data analysis process. Feng, Xiao-dong; Li, Li-wei; Zhang, Jian-hong; Zhu, Yun-ping; Chang, Cheng; Shu, Kun-xian; Ma, Jie. BMC Genomics. Research. BioMed Central 2017-03-14 /pmc/articles/PMC5374549/ /pubmed/28361671 http://dx.doi.org/10.1186/s12864-017-3491-2 Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Using the entrapment sequence method as a standard to evaluate key steps of proteomics data analysis process
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374549/ https://www.ncbi.nlm.nih.gov/pubmed/28361671 http://dx.doi.org/10.1186/s12864-017-3491-2
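
The abstract in this record contrasts two ways of judging a filtered set of identifications: the decoy-based FDR estimate and the share of accepted target matches that fall on entrapment sequences, which are known to be wrong. The Python sketch below illustrates that contrast only. It is not the authors' implementation; the `PSM` class, the three `kind` labels, the higher-is-better score, and the two ratios are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """One peptide-spectrum match (hypothetical structure for this sketch)."""
    score: float  # assumed: higher score means a better match
    kind: str     # assumed labels: "sample", "entrapment", or "decoy"


def summarize(psms, threshold):
    """Summarize the matches that pass a score threshold.

    Target hits are sample hits plus entrapment hits, because entrapment
    sequences are searched as part of the target database in this sketch.
    """
    accepted = [p for p in psms if p.score >= threshold]
    decoys = sum(p.kind == "decoy" for p in accepted)
    entrapment = sum(p.kind == "entrapment" for p in accepted)
    sample = sum(p.kind == "sample" for p in accepted)
    targets = sample + entrapment

    # Assumed estimator: accepted decoys divided by accepted targets.
    decoy_fdr = decoys / targets if targets else 0.0
    # Assumed measure: fraction of accepted targets that hit entrapment sequences.
    entrapment_rate = entrapment / targets if targets else 0.0
    return {
        "targets": targets,
        "decoy_fdr": decoy_fdr,
        "entrapment_rate": entrapment_rate,
    }


# Made-up toy data, not taken from the article.
example = [
    PSM(10.0, "sample"),
    PSM(9.0, "sample"),
    PSM(8.0, "entrapment"),
    PSM(7.0, "decoy"),
    PSM(6.0, "sample"),
    PSM(2.0, "decoy"),
]
result = summarize(example, threshold=5.0)
```

Comparing `decoy_fdr` with `entrapment_rate` across tools or thresholds is one way to see whether a decoy-based estimate is consistent with the directly observed false target matches.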
