Harvest: an open-source tool for the validation and improvement of peptide identification metrics and fragmentation exploration

Bibliographic Details
Main authors:	McHugh, Leo C; Arthur, Jonathan W
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Software
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2941693/ https://www.ncbi.nlm.nih.gov/pubmed/20815925 http://dx.doi.org/10.1186/1471-2105-11-448

Collection:	PubMed
Description:	BACKGROUND: Protein identification using mass spectrometry is an important tool in many areas of the life sciences, and in proteomics research in particular. Increasing the number of correctly identified proteins depends on the ability to incorporate new knowledge about the mass spectrometry fragmentation process into computational algorithms designed to separate true matches of peptides to unidentified mass spectra from spurious matches. This discrimination is achieved by computing a function of the various features of the potential match between the observed and theoretical spectra to give a numerical approximation of their similarity. It is these underlying "metrics" that determine the ability of a protein identification package to maximise correct identifications while limiting false discovery rates. There is currently no software available specifically for the simple implementation and analysis of arbitrary novel metrics for peptide matching and for the exploration of fragmentation patterns for a given dataset.

RESULTS: We present Harvest: an open-source software tool for analysing fragmentation patterns and assessing the power of a new piece of information about the MS/MS fragmentation process to differentiate more clearly between correct and random peptide assignments. We demonstrate this functionality using data metrics derived from the properties of individual datasets in a peptide identification context. Using Harvest, we demonstrate how the development of such metrics may improve correct peptide assignment confidence in the context of a high-throughput proteomics experiment and characterise properties of peptide fragmentation.

CONCLUSIONS: Harvest provides a simple framework in C++ for analysing and prototyping metrics for peptide matching, the core of the protein identification problem. It is not a protein identification package and answers a different research question from packages such as Sequest, Mascot, and X!Tandem. It does not aim to maximise the number of assigned peptides from a set of unknown spectra, but instead provides a method by which researchers can explore fragmentation properties and assess the power of novel metrics for peptide matching in the context of a given experiment. Metrics developed using Harvest may then become candidates for later integration into protein identification packages.
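The idea of a matching "metric" described above, a function of the observed and theoretical spectra that returns a similarity score, can be illustrated with a minimal C++ sketch. This is not Harvest's code or API: the function name `sharedPeakFraction`, its parameters, and the choice of a shared-peak fraction as the metric are all assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical example metric (not part of Harvest): the fraction of
// theoretical fragment m/z values that have at least one observed peak
// within +/- tolerance. Returns a value in [0, 1].
double sharedPeakFraction(const std::vector<double>& theoreticalMz,
                          std::vector<double> observedMz,  // copied so it can be sorted
                          double tolerance) {
    if (theoreticalMz.empty()) {
        return 0.0;  // no theoretical fragments, nothing to match
    }
    std::sort(observedMz.begin(), observedMz.end());

    std::size_t matched = 0;
    for (double mz : theoreticalMz) {
        // First observed peak at or above the lower edge of the window.
        auto it = std::lower_bound(observedMz.begin(), observedMz.end(),
                                   mz - tolerance);
        // It counts as a match only if it also lies below the upper edge.
        if (it != observedMz.end() && *it <= mz + tolerance) {
            ++matched;
        }
    }
    return static_cast<double>(matched) /
           static_cast<double>(theoreticalMz.size());
}
```

Under these assumptions, a correct peptide assignment would tend to score closer to 1 than a random one; assessing how well a given metric separates the two populations is the kind of question the description attributes to Harvest.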
Record ID:	pubmed-2941693
Institution:	National Center for Biotechnology Information
Record format:	MEDLINE/PubMed
Journal:	BMC Bioinformatics
Publication date:	2010-09-06
License:	Copyright ©2010 McHugh and Arthur; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
