
Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics

BACKGROUND: Mass spectrometry is a key technique in proteomics and can be used to analyze complex samples quickly. One key problem with the mass spectrometric analysis of peptides and proteins, however, is the fact that absolute quantification is severely hampered by the unclear relationship between...


Bibliographic Details
Main authors: Timm, Wiebke; Scherbart, Alexandra; Böcker, Sebastian; Kohlbacher, Oliver; Nattkemper, Tim W
Format: Text
Language: English
Published: BioMed Central 2008
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2600826/
https://www.ncbi.nlm.nih.gov/pubmed/18937839
http://dx.doi.org/10.1186/1471-2105-9-443
author Timm, Wiebke
Scherbart, Alexandra
Böcker, Sebastian
Kohlbacher, Oliver
Nattkemper, Tim W
collection PubMed
description BACKGROUND: Mass spectrometry is a key technique in proteomics and can be used to analyze complex samples quickly. One key problem with the mass spectrometric analysis of peptides and proteins, however, is the fact that absolute quantification is severely hampered by the unclear relationship between the observed peak intensity and the peptide concentration in the sample. While there are numerous approaches to circumvent this problem experimentally (e.g. labeling techniques), reliable prediction of the peak intensities from peptide sequences could provide a peptide-specific correction factor. Thus, it would be a valuable tool towards label-free absolute quantification. RESULTS: In this work we present machine learning techniques for peak intensity prediction for MALDI mass spectra. Features encoding the peptides' physico-chemical properties as well as string-based features were extracted. A feature subset was obtained from multiple forward feature selections on the extracted features. Based on these features, two advanced machine learning methods (support vector regression and local linear maps) are shown to yield good results for this problem (Pearson correlation of 0.68 in a ten-fold cross validation). CONCLUSION: The techniques presented here are a useful first step going beyond the binary prediction of proteotypic peptides towards a more quantitative prediction of peak intensities. These predictions in turn will turn out to be beneficial for mass spectrometry-based quantitative proteomics.
format Text
id pubmed-2600826
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2600826 2008-12-15 Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics Timm, Wiebke Scherbart, Alexandra Böcker, Sebastian Kohlbacher, Oliver Nattkemper, Tim W BMC Bioinformatics Methodology Article BioMed Central 2008-10-20 /pmc/articles/PMC2600826/ /pubmed/18937839 http://dx.doi.org/10.1186/1471-2105-9-443 Text en Copyright © 2008 Timm et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics
topic Methodology Article