A matching algorithm with isotope distribution pattern in LC-MS based on support vector machine (SVM) learning model

In proteomics, it is important to detect, analyze, and quantify complex peptide components and differences. The key is to match the elution time peaks (LC peaks) produced by the same peptide in replicate experiments. Warping functions are currently widely used to correct the mean of time shifts amon...

Bibliographic Details
Main Authors: Cui, Jian; Chen, Qiang; Dong, Xiaorui; Shang, Kai; Qi, Xin; Cui, Hao
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2019
Subjects: Chemistry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9071103/
https://www.ncbi.nlm.nih.gov/pubmed/35530479
http://dx.doi.org/10.1039/c9ra03789f
author Cui, Jian
Chen, Qiang
Dong, Xiaorui
Shang, Kai
Qi, Xin
Cui, Hao
author_sort Cui, Jian
collection PubMed
description In proteomics, it is important to detect, analyze, and quantify complex peptide components and their differences. The key step is to match the elution time peaks (LC peaks) produced by the same peptide in replicate experiments. Warping functions are widely used to correct the mean time shift among replicates. However, they cannot reduce the ambiguity between corresponding and non-corresponding peak pairs, because the time shifts vary randomly for each extracted-ion chromatogram (XIC). In this paper, isotope distribution pattern similarity is considered in addition to the time feature. The novelty is that, unlike other feature-based methods that use isotope information, the algorithm relies on isotope distribution similarity rather than the usual peak profile similarity. First, a training set of peptides, including corresponding and non-corresponding peak pairs, was selected from the MS/MS results. Second, the time difference and the isotope distribution pattern similarity were computed for each peak pair. Third, Support Vector Machine (SVM) classification was applied to these two features. Finally, accuracy and final coverage were measured. A 10-fold cross-validation was used to test the effectiveness of the SVM learning model, and the accuracy of correct matching reached up to 97%. The coverage obtained with the learning model ranged from 75% to 91% across datasets. Thus, this matching algorithm based on time and isotope distribution pattern features can provide high accuracy and coverage for corresponding peak identification.
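The two features named in the abstract (time difference and isotope distribution pattern similarity) can be illustrated with a minimal sketch. The abstract does not say how the similarity is computed, so cosine similarity is used here purely as an assumption; the function names, the `rt` and `isotopes` keys, and the sample values are hypothetical. The resulting feature pairs would then be passed to an SVM classifier, a step this sketch leaves out.

```python
import math
from itertools import zip_longest


def isotope_similarity(a, b):
    """Cosine similarity between two isotope intensity patterns.

    Assumption: cosine similarity stands in for the similarity measure,
    which the abstract does not specify. A missing isotope peak is
    treated as zero intensity, so the shorter pattern is zero-padded.
    """
    pairs = list(zip_longest(a, b, fillvalue=0.0))
    dot = sum(x * y for x, y in pairs)
    norm_a = math.sqrt(sum(x * x for x, _ in pairs))
    norm_b = math.sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty or all-zero pattern carries no information
    return dot / (norm_a * norm_b)


def pair_features(peak_a, peak_b):
    """Return (time difference, isotope similarity) for one candidate peak pair."""
    dt = abs(peak_a["rt"] - peak_b["rt"])
    sim = isotope_similarity(peak_a["isotopes"], peak_b["isotopes"])
    return dt, sim


# Hypothetical peaks from two replicate runs (elution time in minutes).
peak_run1 = {"rt": 30.2, "isotopes": [100.0, 60.0, 20.0]}
peak_run2 = {"rt": 30.5, "isotopes": [50.0, 30.0, 10.0]}

dt, sim = pair_features(peak_run1, peak_run2)
print(f"time difference: {dt:.2f} min, isotope similarity: {sim:.3f}")
```

Because cosine similarity ignores overall intensity scale, two peaks of the same peptide measured at different abundances still score close to 1, which is one reason it is a plausible stand-in here.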
format Online
Article
Text
id pubmed-9071103
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-9071103 2022-05-06 RSC Adv Chemistry The Royal Society of Chemistry 2019-09-04 /pmc/articles/PMC9071103/ /pubmed/35530479 http://dx.doi.org/10.1039/c9ra03789f Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by-nc/3.0/
title A matching algorithm with isotope distribution pattern in LC-MS based on support vector machine (SVM) learning model
topic Chemistry