Predicting Primary Biodegradation of Petroleum Hydrocarbons in Aquatic Systems: Integrating System and Molecular Structure Parameters using a Novel Machine‐Learning Framework

Bibliographic Details
Main authors: Davis, Craig Warren, Camenzuli, Louise, Redman, Aaron D.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9320815/
https://www.ncbi.nlm.nih.gov/pubmed/35262215
http://dx.doi.org/10.1002/etc.5328
Description
Summary: Quantitative structure–property relationship (QSPR) models for predicting primary biodegradation of petroleum hydrocarbons have been previously developed. These models use experimental data generated under widely varied conditions, the effects of which are not captured adequately within model formalisms. As a result, they exhibit variable predictive performance and are unable to incorporate the role of study design and test conditions on the assessment of environmental persistence. To address these limitations, a novel machine‐learning System‐Integrated Model (HC‐BioSIM) is presented, which integrates chemical structure and test system variability, leading to improved prediction of primary disappearance time (DT50) values for petroleum hydrocarbons in fresh and marine water. An expanded, highly curated database of 728 experimental DT50 values (181 unique hydrocarbon structures compiled from 13 primary sources) was used to develop and validate a supervised model tree machine‐learning model. Using relatively few parameters (6 system and 25 structural parameters), the model demonstrated significant improvement in predictive performance (root mean square error = 0.26, R² = 0.67) over existing QSPR models. The model also demonstrated improved accuracy of persistence (P) categorization (i.e., “Not P/P/vP”), with an accuracy of 96.8%, and false‐positive and false‐negative categorization rates of 0.4% and 2.7%, respectively. This significant improvement in DT50 prediction, and subsequent persistence categorization, validates the need for models that integrate experimental design and environmental system parameters into biodegradation and persistence assessment. Environ Toxicol Chem 2022;41:1359–1369. © 2022 ExxonMobil Biomedical Sciences, Inc. Environmental Toxicology and Chemistry published by Wiley Periodicals LLC on behalf of SETAC.