
Filtering of MS/MS data for peptide identification

BACKGROUND: The identification of proteins based on analysis of tandem mass spectrometry (MS/MS) data is a valuable tool that is not fully realized because of the difficulty in carrying out automated analysis of large numbers of spectra. MS/MS spectra consist of peaks that represent each peptide fra...


Bibliographic Details
Main authors: Gallia, Jason, Lavrich, Katelyn, Tan-Wilson, Anna, Madden, Patrick H
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3817806/
https://www.ncbi.nlm.nih.gov/pubmed/24564329
http://dx.doi.org/10.1186/1471-2164-14-S7-S2
_version_ 1782478133700591616
author Gallia, Jason
Lavrich, Katelyn
Tan-Wilson, Anna
Madden, Patrick H
author_facet Gallia, Jason
Lavrich, Katelyn
Tan-Wilson, Anna
Madden, Patrick H
author_sort Gallia, Jason
collection PubMed
description BACKGROUND: The identification of proteins based on analysis of tandem mass spectrometry (MS/MS) data is a valuable tool whose potential is not fully realized because of the difficulty of automating the analysis of large numbers of spectra. MS/MS spectra consist of peaks that represent each peptide fragment, usually b and y ions, with experimentally determined mass-to-charge ratios. Whether the strategy employed is database matching or de novo sequencing, a major obstacle is distinguishing signal from noise. An improved ability to distinguish low-intensity signal peaks from background noise increases the likelihood of correctly identifying the peptide, because valuable information is preserved while extraneous information is not left to mislead. RESULTS: This paper introduces an automated noise filtering method based on the construction of orthogonal polynomials. By subdividing the spectrum into a variable number (3 to 11) of bins, peaks that are considered "noise" are identified at a local level. Using a de novo sequencing algorithm that we are developing, this filtering method was applied to a published dataset of more than 3000 mass spectra and an original dataset of more than 300 spectra. The samples were peptides from purified known proteins; therefore, the solutions could be compared to the correct sequences, and the peaks corresponding to b, y and other fragments of significance could be identified. The same procedure was applied using two other published filtering methods. The ratios of the number of significant peaks that were preserved relative to the total number of peaks in each spectrum were determined. Because filtering out too many or too few signal peaks can lead to inaccuracy in sequence determination, the percentage of amino acid residues in the correct positions relative to the total number of amino acid residues in the correct sequence was also calculated for each sequence determined.
CONCLUSIONS: The results show that an orthogonal polynomial-based method of distinguishing signal peaks from background in mass spectra preserves a greater portion of signal peaks than the compared methods, improving accuracy in sequence determination.
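The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the m/z matching tolerance of 0.5, and the choice to compare sequences index by index are all assumptions made for illustration, and the abstract does not say how peaks were matched to known fragments.

```python
from typing import Iterable


def signal_peak_ratio(spectrum_mz: Iterable[float],
                      signal_mz: Iterable[float],
                      tol: float = 0.5) -> float:
    """Ratio of significant peaks to all peaks in a (filtered) spectrum.

    A peak counts as significant if it lies within `tol` of a known
    fragment m/z. The tolerance is an assumed, hypothetical value.
    """
    peaks = list(spectrum_mz)
    if not peaks:
        return 0.0
    signals = list(signal_mz)
    matched = sum(
        1 for mz in peaks if any(abs(mz - s) <= tol for s in signals)
    )
    return matched / len(peaks)


def positional_accuracy(predicted: str, correct: str) -> float:
    """Percentage of residues of `correct` that `predicted` places at the same index."""
    if not correct:
        raise ValueError("correct sequence must be non-empty")
    # zip stops at the shorter string, so missing residues count as wrong.
    matches = sum(1 for p, c in zip(predicted, correct) if p == c)
    return 100.0 * matches / len(correct)
```

Under these assumptions, a filtered spectrum with four peaks of which two match known fragments gives a ratio of 0.5, and a predicted sequence that differs from a seven-residue peptide at one position scores 6/7 of 100 percent.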
format Online
Article
Text
id pubmed-3817806
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3817806 2013-11-07 Filtering of MS/MS data for peptide identification Gallia, Jason Lavrich, Katelyn Tan-Wilson, Anna Madden, Patrick H BMC Genomics Research BioMed Central 2013-11-05 /pmc/articles/PMC3817806/ /pubmed/24564329 http://dx.doi.org/10.1186/1471-2164-14-S7-S2 Text en Copyright © 2013 Gallia et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Gallia, Jason
Lavrich, Katelyn
Tan-Wilson, Anna
Madden, Patrick H
Filtering of MS/MS data for peptide identification
title Filtering of MS/MS data for peptide identification
title_full Filtering of MS/MS data for peptide identification
title_fullStr Filtering of MS/MS data for peptide identification
title_full_unstemmed Filtering of MS/MS data for peptide identification
title_short Filtering of MS/MS data for peptide identification
title_sort filtering of ms/ms data for peptide identification
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3817806/
https://www.ncbi.nlm.nih.gov/pubmed/24564329
http://dx.doi.org/10.1186/1471-2164-14-S7-S2
work_keys_str_mv AT galliajason filteringofmsmsdataforpeptideidentification
AT lavrichkatelyn filteringofmsmsdataforpeptideidentification
AT tanwilsonanna filteringofmsmsdataforpeptideidentification
AT maddenpatrickh filteringofmsmsdataforpeptideidentification