Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS
With advances in machine learning (ML) techniques, the quantitative structure–activity relationship (QSAR) approach is becoming popular for evaluating chemicals. However, the QSAR approach requires that the chemical structure of the target compound is known and that it should be co...
Main author: | Zushi, Yasuyuki |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2022 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9246259/ https://www.ncbi.nlm.nih.gov/pubmed/35700270 http://dx.doi.org/10.1021/acs.analchem.2c01667 |
_version_ | 1784738933355053056 |
---|---|
author | Zushi, Yasuyuki |
author_facet | Zushi, Yasuyuki |
author_sort | Zushi, Yasuyuki |
collection | PubMed |
description | With advances in machine learning (ML) techniques, the quantitative structure–activity relationship (QSAR) approach is becoming popular for evaluating chemicals. However, the QSAR approach requires that the chemical structure of the target compound be known and convertible to molecular descriptors. These requirements lead to limitations in predicting the properties and toxicities of chemicals distributed in the environment as in the PubChem database; the structural information on only 14% of compounds is available. This study proposes a new ML-based QSAR approach that can predict the properties and toxicities of compounds using analytical descriptors of mass spectrum and retention index obtained via gas chromatography–mass spectrometry without requiring exact structural information. The model was developed based on the XGBoost ML method. The root-mean-square errors (RMSEs) for log K_ow, log (molecular weight), melting point, boiling point, log (vapor pressure), log (water solubility), log (LD50) (rat, oral), and log (LD50) (mouse, oral) are 0.97, 0.052, 51, 23, 0.74, 1.1, 0.74, and 0.6, respectively. The model performed well on a chemical standard mixture measurement, with similar results to those of model validation. It also performed well on a measurement of contaminated oil with spectral deconvolution. These results indicate that the model is suitable for investigating unknown-structured chemicals detected in measurements. Any online user can execute the model through a web application named Detective-QSAR (http://www.mixture-platform.net/Detective_QSAR_Med_Open/). The analytical descriptor-based approach is expected to create new opportunities for the evaluation of unknown chemicals around us. |
format | Online Article Text |
id | pubmed-9246259 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9246259 2022-07-01 Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS Zushi, Yasuyuki Anal Chem
American Chemical Society 2022-06-14 2022-06-28 /pmc/articles/PMC9246259/ /pubmed/35700270 http://dx.doi.org/10.1021/acs.analchem.2c01667 Text en © 2022 The Author. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Zushi, Yasuyuki Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS |
title | Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS |
title_full | Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS |
title_fullStr | Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS |
title_full_unstemmed | Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS |
title_short | Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC–MS |
title_sort | direct prediction of physicochemical properties and toxicities of chemicals from analytical descriptors by gc–ms |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9246259/ https://www.ncbi.nlm.nih.gov/pubmed/35700270 http://dx.doi.org/10.1021/acs.analchem.2c01667 |
work_keys_str_mv | AT zushiyasuyuki directpredictionofphysicochemicalpropertiesandtoxicitiesofchemicalsfromanalyticaldescriptorsbygcms |
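
Note on reading the reported errors: the abstract gives root-mean-square errors (RMSEs), several of them on log-transformed quantities. The sketch below is only an illustration of how such a figure is computed and interpreted; it is not the author's code. It assumes the logarithms are base 10 (the record does not say), and the function names `rmse` and `fold_error` as well as the sample values are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences of numbers."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def fold_error(log10_rmse):
    """Multiplicative factor corresponding to an error expressed in log10 units.

    Assumption: the logged quantity uses base 10.
    """
    return 10 ** log10_rmse


# Hypothetical log K_ow values, made up for illustration only.
observed = [2.0, 3.5, 5.0, 1.0]
predicted = [2.5, 3.0, 5.5, 0.5]

print(rmse(predicted, observed))
print(fold_error(0.97))
```

Under the base-10 assumption, an RMSE of 0.97 in log K_ow corresponds to a typical deviation of roughly a factor of nine in K_ow itself, whereas the melting-point and boiling-point errors (51 and 23) are on untransformed quantities and read directly in their own units.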