
Quality assessment and interference detection in targeted mass spectrometry data using machine learning

Advances in the field of targeted proteomics and mass spectrometry have significantly improved assay sensitivity and multiplexing capacity. The high-throughput nature of targeted proteomics experiments has increased the rate of data production, which requires development of novel analytical tools to...


Bibliographic Details
Main authors: Toghi Eshghi, Shadi; Auger, Paul; Mathews, W. Rodney
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6173846/
https://www.ncbi.nlm.nih.gov/pubmed/30323719
http://dx.doi.org/10.1186/s12014-018-9209-x
author Toghi Eshghi, Shadi
Auger, Paul
Mathews, W. Rodney
collection PubMed
description Advances in the field of targeted proteomics and mass spectrometry have significantly improved assay sensitivity and multiplexing capacity. The high-throughput nature of targeted proteomics experiments has increased the rate of data production, which requires development of novel analytical tools to keep up with data processing demand. Currently, development and validation of targeted mass spectrometry assays require manual inspection of chromatographic peaks from large datasets to ensure quality, a process that is time-consuming and prone to inter- and intra-operator variability, and that limits the efficiency of data interpretation from targeted proteomics analyses. To address this challenge, we have developed TargetedMSQC, an R package that facilitates quality control and verification of chromatographic peaks from targeted proteomics datasets. This tool calculates metrics to quantify several quality aspects of a chromatographic peak, e.g. symmetry, jaggedness and modality, co-elution and shape similarity of monitored transitions in a peak group, as well as the consistency of transitions’ ratios between endogenous analytes and isotopically labeled internal standards and consistency of retention time across multiple runs. The algorithm takes advantage of supervised machine learning to identify peaks with interference or poor chromatography based on a set of peaks that have been annotated by an expert analyst. Using TargetedMSQC to analyze targeted proteomics data reduces the time spent on manual inspection of peaks and improves both speed and accuracy of interference detection. Additionally, by allowing analysts to customize the tool for application on different datasets, TargetedMSQC gives users the flexibility to define the acceptable quality for specific datasets. Furthermore, automated and quantitative assessment of peak quality offers a more objective and systematic framework for high-throughput analysis of targeted mass spectrometry assay datasets and is a step towards more robust and faster assay implementation. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12014-018-9209-x) contains supplementary material, which is available to authorized users.
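To make the kind of peak-shape metric mentioned in the description more concrete, the following is a minimal Python sketch of two simplified scores for a single chromatographic trace. It is an illustration only: the function names `jaggedness` and `symmetry` and the formulas behind them are assumptions chosen for this example, not the definitions used by TargetedMSQC (which is an R package). In a supervised workflow such as the one the description outlines, scores of this kind would presumably serve as input features for a classifier trained on analyst-annotated peaks.

```python
from typing import Sequence


def jaggedness(intensities: Sequence[float]) -> float:
    """Fraction of unexpected direction changes in a trace (0.0 = smooth).

    Hypothetical definition for illustration; not TargetedMSQC's formula.
    """
    # Direction (+1 rising, -1 falling) of each step, ignoring flat steps.
    signs = [
        1 if b > a else -1
        for a, b in zip(intensities, intensities[1:])
        if b != a
    ]
    if len(signs) < 2:
        return 0.0
    changes = sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    # A clean single peak changes direction exactly once (rise, then fall),
    # so only changes beyond the first are counted as jaggedness.
    return max(changes - 1, 0) / (len(signs) - 1)


def symmetry(intensities: Sequence[float]) -> float:
    """Ratio of the smaller to the larger flank area around the apex.

    1.0 means the two flanks carry equal signal. Hypothetical definition.
    """
    apex = max(range(len(intensities)), key=lambda i: intensities[i])
    left = sum(intensities[:apex])
    right = sum(intensities[apex + 1:])
    if max(left, right) == 0:
        return 1.0  # no signal on either flank; treat as symmetric
    return min(left, right) / max(left, right)


smooth_peak = [0, 1, 4, 9, 4, 1, 0]
noisy_peak = [0, 5, 1, 9, 2, 6, 0]

print(jaggedness(smooth_peak), symmetry(smooth_peak))
print(jaggedness(noisy_peak), symmetry(noisy_peak))
```

With these toy inputs the smooth trace scores 0.0 for jaggedness and 1.0 for symmetry, while the noisy trace scores 0.8 and 0.75. Any threshold separating acceptable from unacceptable peaks would have to be chosen or learned per dataset.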
format Online
Article
Text
id pubmed-6173846
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6173846 2018-10-15 Quality assessment and interference detection in targeted mass spectrometry data using machine learning Toghi Eshghi, Shadi Auger, Paul Mathews, W. Rodney Clin Proteomics Research BioMed Central 2018-10-06 /pmc/articles/PMC6173846/ /pubmed/30323719 http://dx.doi.org/10.1186/s12014-018-9209-x Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Quality assessment and interference detection in targeted mass spectrometry data using machine learning
topic Research