A matching algorithm with isotope distribution pattern in LC-MS based on support vector machine (SVM) learning model
In proteomics, it is important to detect, analyze, and quantify complex peptide components and differences. The key is to match the elution time peaks (LC peaks) produced by the same peptide in replicate experiments. Warping functions are currently widely used to correct the mean of time shifts amon...
Main Authors: | Cui, Jian; Chen, Qiang; Dong, Xiaorui; Shang, Kai; Qi, Xin; Cui, Hao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society of Chemistry 2019 |
Subjects: | Chemistry |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9071103/ https://www.ncbi.nlm.nih.gov/pubmed/35530479 http://dx.doi.org/10.1039/c9ra03789f |
_version_ | 1784700777706553344 |
---|---|
author | Cui, Jian; Chen, Qiang; Dong, Xiaorui; Shang, Kai; Qi, Xin; Cui, Hao |
author_sort | Cui, Jian |
collection | PubMed |
description | In proteomics, it is important to detect, analyze, and quantify complex peptide components and their differences. The key is to match the elution time peaks (LC peaks) produced by the same peptide in replicate experiments. Warping functions are currently widely used to correct the mean time shift among replicates. However, they cannot resolve the ambiguity between corresponding and non-corresponding peak pairs, because the time shifts vary randomly from one extracted-ion chromatogram (XIC) to another. In this paper, isotope distribution pattern similarity is considered in addition to the time feature. The novelty is that, unlike other feature-based methods that use an isotope feature, the algorithm relies on isotope distribution similarity rather than on the usual peak profile similarity. First, a training set of peptides, including corresponding and non-corresponding peak pairs, was selected from the MS/MS results. Second, the time difference and the isotope distribution pattern similarity were generated for each peak pair. Third, Support Vector Machine (SVM) classification was applied to these two features. Finally, the accuracy was measured along with the final coverage. A 10-fold cross validation was first used to test the effectiveness of the SVM learning model; the accuracy of correct matching could reach 97%. The coverage based on the learning model was then evaluated and ranged from 75% to 91% across different datasets. Thus, this matching algorithm based on time and isotope distribution pattern features could provide high accuracy and coverage for corresponding peak identification. |
format | Online Article Text |
id | pubmed-9071103 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | The Royal Society of Chemistry |
record_format | MEDLINE/PubMed |
spelling | pubmed-9071103 2022-05-06 RSC Adv Chemistry The Royal Society of Chemistry 2019-09-04 /pmc/articles/PMC9071103/ /pubmed/35530479 http://dx.doi.org/10.1039/c9ra03789f Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by-nc/3.0/ |
title | A matching algorithm with isotope distribution pattern in LC-MS based on support vector machine (SVM) learning model |
topic | Chemistry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9071103/ https://www.ncbi.nlm.nih.gov/pubmed/35530479 http://dx.doi.org/10.1039/c9ra03789f |
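
The description above names two features per peak pair: a time difference and an isotope distribution pattern similarity. The record does not say how the similarity is computed, so the following is only a minimal sketch that assumes cosine similarity between isotope intensity vectors. The function names `isotope_similarity` and `pair_features` and the example values are hypothetical, not taken from the paper.

```python
import math
from typing import Sequence, Tuple


def isotope_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two isotope intensity vectors.

    Assumption: cosine similarity stands in for the paper's measure,
    which this record does not specify.
    """
    if len(a) != len(b):
        raise ValueError("isotope vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty pattern is treated as matching nothing
    return dot / (norm_a * norm_b)


def pair_features(
    rt_a: float,
    iso_a: Sequence[float],
    rt_b: float,
    iso_b: Sequence[float],
) -> Tuple[float, float]:
    """Return (absolute elution time difference, isotope similarity)."""
    return abs(rt_a - rt_b), isotope_similarity(iso_a, iso_b)


# Hypothetical peak pair: same isotope pattern at twice the intensity,
# eluting 0.5 time units apart in the two replicates.
time_diff, similarity = pair_features(
    30.0, [100.0, 55.0, 18.0], 30.5, [200.0, 110.0, 36.0]
)
```

Each pair of values produced this way would form one row of the two-feature input to a classifier such as an SVM. Cosine similarity ignores overall intensity, so a peptide measured at different abundances in two replicates can still score as a close match.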