Probabilistic framework for integration of mass spectrum and retention time information in small molecule identification
MOTIVATION: Identification of small molecules in a biological sample remains a major bottleneck in molecular biology, despite a decade of rapid development of computational approaches for predicting molecular structures using mass spectrometry (MS) data. Recently, there has been increasing interest...
Main authors: | Bach, Eric; Rogers, Simon; Williamson, John; Rousu, Juho |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2020 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8289373/ https://www.ncbi.nlm.nih.gov/pubmed/33244585 http://dx.doi.org/10.1093/bioinformatics/btaa998 |
_version_ | 1783724288555089920 |
---|---|
author | Bach, Eric Rogers, Simon Williamson, John Rousu, Juho |
collection | PubMed |
description | MOTIVATION: Identification of small molecules in a biological sample remains a major bottleneck in molecular biology, despite a decade of rapid development of computational approaches for predicting molecular structures using mass spectrometry (MS) data. Recently, there has been increasing interest in utilizing other information sources, such as liquid chromatography (LC) retention time (RT), to improve identifications solely based on MS information, such as precursor mass-per-charge and tandem mass spectrometry (MS²). RESULTS: We put forward a probabilistic modelling framework to integrate MS and RT data of multiple features in an LC-MS experiment. We model the MS measurements and all pairwise retention order information as a Markov random field and use efficient approximate inference for scoring and ranking potential molecular structures. Our experiments show improved identification accuracy by combining MS² data and retention orders using our approach, thereby outperforming state-of-the-art methods. Furthermore, we demonstrate the benefit of our model when only a subset of LC-MS features has MS² measurements available besides MS¹. AVAILABILITY AND IMPLEMENTATION: Software and data are freely available at https://github.com/aalto-ics-kepaco/msms_rt_score_integration. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8289373 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8289373 2021-07-20 Bioinformatics Original Papers Oxford University Press 2020-11-27 /pmc/articles/PMC8289373/ /pubmed/33244585 http://dx.doi.org/10.1093/bioinformatics/btaa998 Text en © The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Probabilistic framework for integration of mass spectrum and retention time information in small molecule identification |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8289373/ https://www.ncbi.nlm.nih.gov/pubmed/33244585 http://dx.doi.org/10.1093/bioinformatics/btaa998 |
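
The description above outlines the core idea of the paper: each LC-MS feature has a set of candidate structures with MS-based scores, and pairwise retention order information couples the features in a Markov random field. The following is a minimal illustrative sketch of that idea, not the authors' implementation (which is available at the GitHub URL in the description). It assumes a sigmoid edge potential and computes max-marginals by exhaustive enumeration, which is only feasible for toy inputs and stands in for the efficient approximate inference mentioned in the abstract. The function name `max_marginals`, the parameter `k`, and all input values are hypothetical.

```python
import math
from itertools import product


def max_marginals(ms_scores, pred_rt, obs_rt, k=1.0):
    """Toy max-marginal scoring for a fully connected pairwise MRF.

    ms_scores[i][r]: MS-based score (in (0, 1]) of candidate r for feature i.
    pred_rt[i][r]:   predicted retention score of candidate r for feature i
                     (assumed to come from some retention order predictor).
    obs_rt[i]:       observed retention time of feature i.
    k:               hypothetical sharpness of the sigmoid edge potential.

    Returns best[i][r], the highest joint log-score over all assignments
    in which feature i is assigned candidate r.
    """
    n = len(ms_scores)
    best = [[-math.inf] * len(scores) for scores in ms_scores]

    # Exhaustive enumeration of every joint candidate assignment.
    for assign in product(*(range(len(scores)) for scores in ms_scores)):
        # Node potentials: MS evidence for each chosen candidate.
        log_score = sum(math.log(ms_scores[i][assign[i]]) for i in range(n))

        # Edge potentials: reward assignments whose predicted retention
        # order agrees with the observed order of each feature pair.
        for i in range(n):
            for j in range(i + 1, n):
                sign = 1.0 if obs_rt[i] < obs_rt[j] else -1.0
                diff = pred_rt[j][assign[j]] - pred_rt[i][assign[i]]
                log_score += -math.log1p(math.exp(-k * sign * diff))

        # Record the best joint score seen for each (feature, candidate).
        for i in range(n):
            best[i][assign[i]] = max(best[i][assign[i]], log_score)

    return best


# Two features, two candidates each; feature 0 elutes before feature 1.
ms_scores = [[0.6, 0.4], [0.5, 0.5]]
pred_rt = [[5.0, 1.0], [2.0, 3.0]]
obs_rt = [1.2, 4.7]

mm = max_marginals(ms_scores, pred_rt, obs_rt)
for i, row in enumerate(mm):
    ranking = sorted(range(len(row)), key=lambda r: row[r], reverse=True)
    print(f"feature {i}: candidate ranking {ranking}")
```

In this toy input, candidate 0 of feature 0 has the higher MS score, but its predicted retention score contradicts the observed elution order, so the max-marginal ranking places candidate 1 first. Ranking candidates per feature by max-marginals is one common way to use such a model; the enumeration here grows exponentially with the number of features.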