Probabilistic framework for integration of mass spectrum and retention time information in small molecule identification

Bibliographic Details
Main authors: Bach, Eric; Rogers, Simon; Williamson, John; Rousu, Juho
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8289373/
https://www.ncbi.nlm.nih.gov/pubmed/33244585
http://dx.doi.org/10.1093/bioinformatics/btaa998
_version_ 1783724288555089920
author Bach, Eric
Rogers, Simon
Williamson, John
Rousu, Juho
collection PubMed
description MOTIVATION: Identification of small molecules in a biological sample remains a major bottleneck in molecular biology, despite a decade of rapid development of computational approaches for predicting molecular structures using mass spectrometry (MS) data. Recently, there has been increasing interest in utilizing other information sources, such as liquid chromatography (LC) retention time (RT), to improve identifications solely based on MS information, such as precursor mass-per-charge and tandem mass spectrometry (MS²). RESULTS: We put forward a probabilistic modelling framework to integrate MS and RT data of multiple features in an LC-MS experiment. We model the MS measurements and all pairwise retention order information as a Markov random field and use efficient approximate inference for scoring and ranking potential molecular structures. Our experiments show improved identification accuracy by combining MS² data and retention orders using our approach, thereby outperforming state-of-the-art methods. Furthermore, we demonstrate the benefit of our model when only a subset of LC-MS features has MS² measurements available besides MS¹. AVAILABILITY AND IMPLEMENTATION: Software and data are freely available at https://github.com/aalto-ics-kepaco/msms_rt_score_integration. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
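The description above mentions scoring candidate structures with a Markov random field built from MS scores and pairwise retention orders. The Python sketch below illustrates that idea only; it is not the authors' implementation. The candidate names, MS scores, predicted retention values, and the sigmoid edge potential are all hypothetical, and exhaustive enumeration stands in for the efficient approximate inference the description refers to, so it is usable only on toy inputs.

```python
from itertools import product
from math import exp


def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))


def max_marginals(features):
    """Brute-force max-marginals on a fully connected toy MRF.

    features: list of LC-MS features sorted by observed retention time.
    Each feature is a list of (candidate_name, ms_score, predicted_retention)
    tuples. All three values are hypothetical inputs for illustration.
    """
    best = [{name: 0.0 for name, _, _ in cands} for cands in features]
    # Enumerate every joint assignment of one candidate per feature.
    for assignment in product(*features):
        score = 1.0
        # Node potentials: the MS-based score of each chosen candidate.
        for _, ms_score, _ in assignment:
            score *= ms_score
        # Edge potentials (assumed form): high when the predicted retention
        # of the later-eluting feature exceeds that of the earlier one.
        for i in range(len(assignment)):
            for j in range(i + 1, len(assignment)):
                score *= sigmoid(assignment[j][2] - assignment[i][2])
        # Max-marginal: best joint score that contains each candidate.
        for i, (name, _, _) in enumerate(assignment):
            best[i][name] = max(best[i][name], score)
    return best


# Two features; the first eluted before the second.
features = [
    [("A1", 0.6, 5.0), ("A2", 0.4, 1.0)],
    [("B1", 0.5, 3.0), ("B2", 0.5, 6.0)],
]
ranking = max_marginals(features)
for i, scores in enumerate(ranking):
    print(i, sorted(scores.items(), key=lambda kv: -kv[1]))
```

In this toy input B1 and B2 tie on MS score, yet B2 receives the higher max-marginal because its predicted retention is consistent with the observed elution order for the best-scoring partner candidate.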
format Online
Article
Text
id pubmed-8289373
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8289373 2021-07-20 Bach, Eric; Rogers, Simon; Williamson, John; Rousu, Juho. Bioinformatics, Original Papers. Oxford University Press, 2020-11-27. Text en. © The Author(s) 2020. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Probabilistic framework for integration of mass spectrum and retention time information in small molecule identification
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8289373/
https://www.ncbi.nlm.nih.gov/pubmed/33244585
http://dx.doi.org/10.1093/bioinformatics/btaa998