
Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints

Bibliographic Details
Main Authors: Ludwig, Marcus; Dührkop, Kai; Böcker, Sebastian
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022630/
https://www.ncbi.nlm.nih.gov/pubmed/29949965
http://dx.doi.org/10.1093/bioinformatics/bty245
author Ludwig, Marcus
Dührkop, Kai
Böcker, Sebastian
collection PubMed
description MOTIVATION: Metabolites, small molecules that are involved in cellular reactions, provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem mass spectrometry to identify the thousands of compounds in a biological sample. Recently, we presented CSI:FingerID for searching in molecular structure databases using tandem mass spectrometry data. CSI:FingerID predicts a molecular fingerprint that encodes the structure of the query compound, then uses this to search a molecular structure database such as PubChem. Scoring of the predicted query fingerprint and deterministic target fingerprints is carried out assuming independence between the molecular properties constituting the fingerprint.
RESULTS: We present a scoring that takes into account dependencies between molecular properties. As before, we predict posterior probabilities of molecular properties using machine learning. Dependencies between molecular properties are modeled as a Bayesian tree network; the tree structure is estimated on the fly from the instance data. For each edge, we also estimate the expected covariance between the two random variables. For fixed marginal probabilities, we then estimate conditional probabilities using the known covariance. Now, the corrected posterior probability of each candidate can be computed, and candidates are ranked by this score. Modeling dependencies improves identification rates of CSI:FingerID by 2.85 percentage points.
AVAILABILITY AND IMPLEMENTATION: The new scoring Bayesian (fixed tree) is integrated into SIRIUS 4.0 (https://bio.informatik.uni-jena.de/software/sirius/).
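The step described in the abstract, deriving conditional probabilities from fixed marginals and an edge covariance and then scoring a candidate along a tree, can be illustrated with a small Python sketch. This is not the SIRIUS or CSI:FingerID implementation. The function names `conditional_from_covariance` and `tree_log_score`, the clamping of the joint probability to its feasible range, and the `1e-12` floor inside the logarithm are all assumptions made for illustration, and the marginals are assumed to lie strictly between 0 and 1.

```python
import math


def conditional_from_covariance(p_parent, p_child, cov):
    """Return (P(child=1 | parent=1), P(child=1 | parent=0)) for two binary variables.

    Assumes 0 < p_parent < 1. For binary variables the covariance satisfies
    cov = P(parent=1, child=1) - p_parent * p_child.
    """
    # Clamp the joint probability to the range allowed by the two marginals
    # (illustrative safeguard, not taken from the paper).
    lowest = max(0.0, p_parent + p_child - 1.0)
    highest = min(p_parent, p_child)
    p11 = min(highest, max(lowest, p_parent * p_child + cov))
    given_one = p11 / p_parent
    given_zero = (p_child - p11) / (1.0 - p_parent)
    return given_one, given_zero


def tree_log_score(candidate, marginals, parent, cov):
    """Log-probability of a 0/1 candidate fingerprint under a tree-shaped model.

    candidate: list of 0/1 bits, one per molecular property
    marginals: predicted P(property = 1) per property
    parent:    parent index per property, or None for the root
    cov:       covariance on the edge (parent[i], i); ignored for the root
    """
    total = 0.0
    for i, bit in enumerate(candidate):
        if parent[i] is None:
            p_one = marginals[i]
        else:
            given_one, given_zero = conditional_from_covariance(
                marginals[parent[i]], marginals[i], cov[i]
            )
            p_one = given_one if candidate[parent[i]] == 1 else given_zero
        prob = p_one if bit == 1 else 1.0 - p_one
        total += math.log(max(prob, 1e-12))  # floor avoids log(0)
    return total
```

With zero covariance on every edge the score reduces to the independence assumption mentioned in the abstract; a positive covariance raises the score of candidates in which both properties on that edge are present.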
format Online
Article
Text
id pubmed-6022630
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6022630 2018-07-10 Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints Ludwig, Marcus Dührkop, Kai Böcker, Sebastian Bioinformatics ISMB 2018–Intelligent Systems for Molecular Biology Proceedings
Oxford University Press 2018-07-01 2018-06-27 /pmc/articles/PMC6022630/ /pubmed/29949965 http://dx.doi.org/10.1093/bioinformatics/bty245 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints
topic ISMB 2018–Intelligent Systems for Molecular Biology Proceedings