
Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS

With advances in machine learning (ML) techniques, the quantitative structure–activity relationship (QSAR) approach is becoming popular for evaluating chemicals. However, the QSAR approach requires that the chemical structure of the target compound is known and that it should be co...


Bibliographic Details
Main Author: Zushi, Yasuyuki
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9246259/
https://www.ncbi.nlm.nih.gov/pubmed/35700270
http://dx.doi.org/10.1021/acs.analchem.2c01667
collection PubMed
description With advances in machine learning (ML) techniques, the quantitative structure–activity relationship (QSAR) approach is becoming popular for evaluating chemicals. However, the QSAR approach requires that the chemical structure of the target compound is known and that it should be convertible to molecular descriptors. These requirements lead to limitations in predicting the properties and toxicities of chemicals distributed in the environment as in the PubChem database; the structural information on only 14% of compounds is available. This study proposes a new ML-based QSAR approach that can predict the properties and toxicities of compounds using analytical descriptors of mass spectrum and retention index obtained via gas chromatography–mass spectrometry without requiring exact structural information. The model was developed based on the XGBoost ML method. The root-mean-square errors (RMSEs) for log Kow, log (molecular weight), melting point, boiling point, log (vapor pressure), log (water solubility), log (LD50) (rat, oral), and log (LD50) (mouse, oral) are 0.97, 0.052, 51, 23, 0.74, 1.1, 0.74, and 0.6, respectively. The model performed well on a chemical standard mixture measurement, with similar results to those of model validation. It also performed well on a measurement of contaminated oil with spectral deconvolution. These results indicate that the model is suitable for investigating unknown-structured chemicals detected in measurements. Any online user can execute the model through a web application named Detective-QSAR (http://www.mixture-platform.net/Detective_QSAR_Med_Open/). The analytical descriptor-based approach is expected to create new opportunities for the evaluation of unknown chemicals around us.
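The abstract describes model inputs built from a mass spectrum plus a retention index, and reports accuracy as RMSE. The Python sketch below illustrates both ideas in a minimal form. It is not the author's implementation: the unit-mass binning, base-peak scaling, the `mz_max` cutoff, the function names, and the toy peak list are all assumptions made for illustration, and the record does not state how the descriptors were actually encoded.

```python
import math


def spectrum_to_descriptor(peaks, retention_index, mz_max=500):
    """Turn (m/z, intensity) peaks and a retention index into one feature vector.

    Assumptions (not from the source): unit-mass bins covering m/z 1..mz_max,
    intensities scaled so the base peak equals 1.0, retention index appended last.
    """
    bins = [0.0] * mz_max
    for mz, intensity in peaks:
        idx = int(round(mz)) - 1  # nominal mass -> zero-based bin index
        if 0 <= idx < mz_max:
            bins[idx] += intensity
    base = max(bins)
    if base > 0:
        bins = [b / base for b in bins]
    return bins + [float(retention_index)]


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy spectrum: three peaks and a retention index of 1000.
toy_peaks = [(43.0, 250.0), (57.1, 1000.0), (71.0, 500.0)]
features = spectrum_to_descriptor(toy_peaks, retention_index=1000)

# Hypothetical observed vs. predicted values on a log scale.
error = rmse([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])
```

A vector like `features` could be passed to any regression model. If the reported logarithms are base 10 (an assumption, since the record does not give the base), an RMSE of 0.74 on log (LD50) would correspond to a typical error factor of about 10^0.74, roughly 5.5-fold.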
format Online Article Text
id pubmed-9246259
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9246259 2022-07-01 Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS Zushi, Yasuyuki Anal Chem
American Chemical Society 2022-06-14 2022-06-28 /pmc/articles/PMC9246259/ /pubmed/35700270 http://dx.doi.org/10.1021/acs.analchem.2c01667 Text en © 2022 The Author. Published by American Chemical Society. https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained, but does not permit creation of adaptations or other derivative works.