Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints
MOTIVATION: Metabolites, small molecules that are involved in cellular reactions, provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem mass spectrometry to identify the thousands of compounds in a biological sample. Recently, we presente...
Main Authors: | Ludwig, Marcus, Dührkop, Kai, Böcker, Sebastian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | Ismb 2018–Intelligent Systems for Molecular Biology Proceedings |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022630/ https://www.ncbi.nlm.nih.gov/pubmed/29949965 http://dx.doi.org/10.1093/bioinformatics/bty245 |
_version_ | 1783335719432880128 |
---|---|
author | Ludwig, Marcus Dührkop, Kai Böcker, Sebastian |
author_sort | Ludwig, Marcus |
collection | PubMed |
description | MOTIVATION: Metabolites, small molecules that are involved in cellular reactions, provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem mass spectrometry to identify the thousands of compounds in a biological sample. Recently, we presented CSI:FingerID for searching in molecular structure databases using tandem mass spectrometry data. CSI:FingerID predicts a molecular fingerprint that encodes the structure of the query compound, then uses this to search a molecular structure database such as PubChem. Scoring of the predicted query fingerprint and deterministic target fingerprints is carried out assuming independence between the molecular properties constituting the fingerprint. RESULTS: We present a scoring that takes into account dependencies between molecular properties. As before, we predict posterior probabilities of molecular properties using machine learning. Dependencies between molecular properties are modeled as a Bayesian tree network; the tree structure is estimated on the fly from the instance data. For each edge, we also estimate the expected covariance between the two random variables. For fixed marginal probabilities, we then estimate conditional probabilities using the known covariance. Now, the corrected posterior probability of each candidate can be computed, and candidates are ranked by this score. Modeling dependencies improves identification rates of CSI:FingerID by 2.85 percentage points. AVAILABILITY AND IMPLEMENTATION: The new scoring Bayesian (fixed tree) is integrated into SIRIUS 4.0 (https://bio.informatik.uni-jena.de/software/sirius/). |
format | Online Article Text |
id | pubmed-6022630 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6022630 2018-07-10 Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints Ludwig, Marcus Dührkop, Kai Böcker, Sebastian Bioinformatics Ismb 2018–Intelligent Systems for Molecular Biology Proceedings Oxford University Press 2018-07-01 2018-06-27 /pmc/articles/PMC6022630/ /pubmed/29949965 http://dx.doi.org/10.1093/bioinformatics/bty245 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb 2018–Intelligent Systems for Molecular Biology Proceedings Ludwig, Marcus Dührkop, Kai Böcker, Sebastian Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints |
title | Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints |
topic | Ismb 2018–Intelligent Systems for Molecular Biology Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022630/ https://www.ncbi.nlm.nih.gov/pubmed/29949965 http://dx.doi.org/10.1093/bioinformatics/bty245 |
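
The description above states that dependencies between molecular properties are modeled as a tree, and that conditional probabilities are derived from fixed marginal probabilities and a per-edge covariance. The sketch below illustrates that general idea for binary properties only. It is a simplified illustration, not the authors' estimator or the SIRIUS implementation: the function names, the clipping step, the probability floor, and the toy numbers are all assumptions made for this example. For two binary variables X (parent) and Y (child), it uses the identity P(X=1, Y=1) = P(X=1) P(Y=1) + Cov(X, Y).

```python
import math


def clip_covariance(p_x, p_y, cov):
    """Restrict cov to the range that keeps all four joint probabilities >= 0."""
    lower = max(0.0, p_x + p_y - 1.0) - p_x * p_y
    upper = min(p_x, p_y) - p_x * p_y
    return min(max(cov, lower), upper)


def conditional(p_x, p_y, cov, x, y):
    """P(Y=y | X=x) for binary X, Y with marginals p_x, p_y and covariance cov.

    Assumes 0 < p_x < 1 so that the parent state has nonzero probability.
    """
    cov = clip_covariance(p_x, p_y, cov)
    p11 = p_x * p_y + cov  # P(X=1, Y=1)
    joint = {
        (1, 1): p11,
        (1, 0): p_x - p11,
        (0, 1): p_y - p11,
        (0, 0): 1.0 - p_x - p_y + p11,
    }
    p_parent = p_x if x == 1 else 1.0 - p_x
    return joint[(x, y)] / p_parent


def tree_log_score(fingerprint, marginals, parent, edge_cov):
    """Log-probability of a binary fingerprint under a tree-structured model.

    parent[i] is the index of node i's parent, or None for the root.
    edge_cov[i] is the covariance between node i and its parent (unused for the root).
    """
    score = 0.0
    for i, bit in enumerate(fingerprint):
        j = parent[i]
        if j is None:
            p = marginals[i] if bit == 1 else 1.0 - marginals[i]
        else:
            p = conditional(marginals[j], marginals[i], edge_cov[i], fingerprint[j], bit)
        score += math.log(max(p, 1e-12))  # floor avoids log(0); an assumption of this sketch
    return score


def independent_log_score(fingerprint, marginals):
    """Baseline score that treats all properties as independent."""
    return sum(
        math.log(max(p if bit == 1 else 1.0 - p, 1e-12))
        for bit, p in zip(fingerprint, marginals)
    )


def rank_candidates(candidates, marginals, parent, edge_cov):
    """Sort candidate fingerprints from highest to lowest tree score."""
    return sorted(
        candidates,
        key=lambda fp: tree_log_score(fp, marginals, parent, edge_cov),
        reverse=True,
    )


# Toy example: three properties, node 0 is the root and the parent of nodes 1 and 2.
marginals = [0.9, 0.8, 0.3]
parent = [None, 0, 0]
edge_cov = [0.0, 0.05, -0.02]
ranking = rank_candidates([(0, 0, 1), (1, 1, 0)], marginals, parent, edge_cov)
```

When every edge covariance is zero, the tree score reduces to the independence score, which makes the baseline a convenient sanity check for the dependent model.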