
Development of Database Assisted Structure Identification (DASI) Methods for Nontargeted Metabolomics

Metabolite structure identification remains a significant challenge in nontargeted metabolomics research. One commonly used strategy relies on searching biochemical databases using exact mass. However, this approach fails when the database does not contain the unknown metabolite (i.e., for unknown-u...


Bibliographic Details
Main authors: Menikarachchi, Lochana C.; Dubey, Ritvik; Hill, Dennis W.; Brush, Daniel N.; Grant, David F.
Format: Online Article Text
Language: English
Published: MDPI 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4931548/
https://www.ncbi.nlm.nih.gov/pubmed/27258318
http://dx.doi.org/10.3390/metabo6020017
_version_ 1782440919206723584
author Menikarachchi, Lochana C.
Dubey, Ritvik
Hill, Dennis W.
Brush, Daniel N.
Grant, David F.
collection PubMed
description Metabolite structure identification remains a significant challenge in nontargeted metabolomics research. One commonly used strategy relies on searching biochemical databases using exact mass. However, this approach fails when the database does not contain the unknown metabolite (i.e., for unknown-unknowns). For these cases, constrained structure generation with combinatorial structure generators provides a potential option. Here we evaluated structure generation constraints based on the specification of: (1) substructures required (i.e., seed structures); (2) substructures not allowed; and (3) filters to remove incorrect structures. Our approach (database assisted structure identification, DASI) used predictive models in MolFind to find candidate structures with chemical and physical properties similar to the unknown. These candidates were then used for seed structure generation using eight different structure generation algorithms. One algorithm was able to generate correct seed structures for 21/39 test compounds. Eleven of these seed structures were large enough to constrain the combinatorial structure generator to fewer than 100,000 structures. In 35/39 cases, at least one algorithm was able to generate a correct seed structure. The DASI method has several limitations and will require further experimental validation and optimization. At present, it seems most useful for identifying the structure of unknown-unknowns with molecular weights <200 Da.
format Online
Article
Text
id pubmed-4931548
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4931548 2016-07-08 Metabolites Article MDPI 2016-05-31 /pmc/articles/PMC4931548/ /pubmed/27258318 http://dx.doi.org/10.3390/metabo6020017 Text en © 2016 by the authors; licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
title Development of Database Assisted Structure Identification (DASI) Methods for Nontargeted Metabolomics
topic Article