Data Science Approach to Estimate Enthalpy of Formation of Cyclic Hydrocarbons

Despite the increasing importance of cyclic hydrocarbons in various chemical systems, studies on the fundamental properties of these compounds, such as enthalpy of formation, are still scarce. One of the reasons for this is the fact that the estimation of the thermodynamic properti...

Bibliographic Details
Main Authors: Yalamanchi, Kiran K., Monge-Palacios, M., van Oudenhoven, Vincent C. O., Gao, Xin, Sarathy, S. Mani
Format: Online Article Text
Language: English
Published: American Chemical Society 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7458419/
https://www.ncbi.nlm.nih.gov/pubmed/32648745
http://dx.doi.org/10.1021/acs.jpca.0c02785
author Yalamanchi, Kiran K.
Monge-Palacios, M.
van Oudenhoven, Vincent C. O.
Gao, Xin
Sarathy, S. Mani
collection PubMed
description Despite the increasing importance of cyclic hydrocarbons in various chemical systems, studies on the fundamental properties of these compounds, such as enthalpy of formation, are still scarce. One reason is that estimating the thermodynamic properties of cyclic hydrocarbon species via cost-effective computational approaches, such as group additivity (GA), has several limitations and challenges. In this study, a machine learning (ML) approach is proposed that uses a support vector regression (SVR) algorithm to predict the standard enthalpy of formation of cyclic hydrocarbon species. The model is developed from a carefully selected dataset of accurate experimental values for 192 species collected from the literature. The molecular descriptors used as input to the SVR are calculated with alvaDesc software, which computes a total of 5255 features classified into 30 categories. The developed SVR model has an average error of approximately 10 kJ/mol. The SVR model outperforms the GA approach for complex molecules and can therefore be proposed as a novel data-driven approach to estimate enthalpy values for complex cyclic species. A sensitivity analysis is also conducted to identify the features that most affect the standard enthalpy of formation of cyclic species. The species dataset is expected to be updated and expanded as new data become available, in order to develop a more accurate SVR model with broader applicability.
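For readers unfamiliar with the GA baseline mentioned in the abstract, the following is a minimal sketch of a Benson-style group additivity estimate for simple cycloalkanes. It is not taken from the article. The group value and ring corrections are approximate figures from commonly reproduced Benson-type tables (an assumption to verify against a tabulated source before any real use), and the names `GROUP_VALUES`, `RING_CORRECTIONS`, and `ga_enthalpy` are hypothetical. The sketch illustrates that a ring needs a dedicated correction term on top of the summed group contributions, and such corrections are tabulated only for a limited set of ring systems.

```python
# Minimal group additivity (GA) sketch for gas-phase enthalpy of formation.
# All numeric values are approximate Benson-type table values in kcal/mol
# (assumption: verify against a reference table before use).

KCAL_TO_KJ = 4.184  # exact conversion factor for the thermochemical calorie

# Contribution of each group (hypothetical dictionary name).
GROUP_VALUES = {
    "C-(C)2(H)2": -4.93,  # secondary carbon bonded to two carbons
}

# Ring-strain corrections keyed by ring size for unsubstituted cycloalkanes.
RING_CORRECTIONS = {3: 27.6, 4: 26.2, 5: 6.3, 6: 0.0}


def ga_enthalpy(groups, ring_sizes=()):
    """Return a GA estimate of the enthalpy of formation in kJ/mol.

    groups     -- mapping of group label to number of occurrences
    ring_sizes -- iterable of ring sizes present in the molecule
    """
    total = sum(GROUP_VALUES[label] * count for label, count in groups.items())
    total += sum(RING_CORRECTIONS[size] for size in ring_sizes)
    return total * KCAL_TO_KJ


cyclopropane = ga_enthalpy({"C-(C)2(H)2": 3}, ring_sizes=[3])
cyclopentane = ga_enthalpy({"C-(C)2(H)2": 5}, ring_sizes=[5])
cyclohexane = ga_enthalpy({"C-(C)2(H)2": 6}, ring_sizes=[6])

for name, value in [("cyclopropane", cyclopropane),
                    ("cyclopentane", cyclopentane),
                    ("cyclohexane", cyclohexane)]:
    print(f"{name}: {value:.1f} kJ/mol")
```

A fused or bridged ring system that has no entry in `RING_CORRECTIONS` raises a `KeyError` here, which mirrors the practical gap a data-driven model such as the SVR described in the abstract is intended to fill.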
format Online
Article
Text
id pubmed-7458419
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-7458419 2020-09-01 Data Science Approach to Estimate Enthalpy of Formation of Cyclic Hydrocarbons Yalamanchi, Kiran K. Monge-Palacios, M. van Oudenhoven, Vincent C. O. Gao, Xin Sarathy, S. Mani J Phys Chem A
American Chemical Society 2020-07-10 2020-08-06 /pmc/articles/PMC7458419/ /pubmed/32648745 http://dx.doi.org/10.1021/acs.jpca.0c02785 Text en Copyright © 2020 American Chemical Society. This is an open access article published under a Creative Commons Attribution (CC-BY) License (http://pubs.acs.org/page/policy/authorchoice_ccby_termsofuse.html), which permits unrestricted use, distribution and reproduction in any medium, provided the author and source are cited.
title Data Science Approach to Estimate Enthalpy of Formation of Cyclic Hydrocarbons