From Centroided to Profile Mode: Machine Learning for Prediction of Peak Width in HRMS Data

Centroiding is one of the major approaches used for size reduction of the data generated by high-resolution mass spectrometry. During centroiding, performed either during acquisition or as a pre-processing step, the mass profiles are represented by a single value (i.e., the centroi...

Bibliographic Details
Main Authors: Samanipour, Saer; Choi, Phil; O’Brien, Jake W.; Pirok, Bob W. J.; Reid, Malcolm J.; Thomas, Kevin V.
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8674881/
https://www.ncbi.nlm.nih.gov/pubmed/34843646
http://dx.doi.org/10.1021/acs.analchem.1c03755
_version_ 1784615765465366528
author Samanipour, Saer
Choi, Phil
O’Brien, Jake W.
Pirok, Bob W. J.
Reid, Malcolm J.
Thomas, Kevin V.
collection PubMed
description Centroiding is one of the major approaches used to reduce the size of data generated by high-resolution mass spectrometry. During centroiding, performed either during acquisition or as a pre-processing step, each mass profile is represented by a single value (i.e., the centroid). While effective in reducing data size, centroiding also reduces the information density present in the mass peak profile. Moreover, the individual steps of the centroiding process and their consequences for the final results may not be completely clear. Here, we present Cent2Prof, a package containing two algorithms that enable the conversion of centroided data to mass peak profile data and vice versa. The centroiding algorithm uses the resolution-based mass peak width parameter as the first guess and self-adjusts to fit the data. In addition to the m/z values, the centroiding algorithm also generates the measured mass peak widths at half-height, which can be used during feature detection and identification. The mass peak profile prediction algorithm employs a random-forest model to predict mass peak widths, which are then used for mass profile reconstruction. The centroiding results were compared to the outputs of the MZmine-implemented centroiding algorithm. Our algorithm resulted in false detection rates of ≤5%, while the MZmine algorithm resulted in a 30% rate of false positives and a 3% rate of false negatives. The error in profile prediction was ≤56% independent of the mass, ionization mode, and intensity, which was 6 times more accurate than the resolution-based estimated values.
format Online
Article
Text
id pubmed-8674881
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8674881 2021-12-17 From Centroided to Profile Mode: Machine Learning for Prediction of Peak Width in HRMS Data Samanipour, Saer Choi, Phil O’Brien, Jake W. Pirok, Bob W. J. Reid, Malcolm J. Thomas, Kevin V. Anal Chem
American Chemical Society 2021-11-29 2021-12-14 /pmc/articles/PMC8674881/ /pubmed/34843646 http://dx.doi.org/10.1021/acs.analchem.1c03755 Text en © 2021 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title From Centroided to Profile Mode: Machine Learning for Prediction of Peak Width in HRMS Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8674881/
https://www.ncbi.nlm.nih.gov/pubmed/34843646
http://dx.doi.org/10.1021/acs.analchem.1c03755