Cargando…

The Bigger Fish: A Comparison of Meta-Learning QSAR Models on Low-Resourced Aquatic Toxicity Regression Tasks

[Image: see text] Toxicological information as needed for risk assessments of chemical compounds is often sparse. Unfortunately, gathering new toxicological information experimentally often involves animal testing. Simulated alternatives, e.g., quantitative structure–activity relationship (QSAR) mod...

Descripción completa

Detalles Bibliográficos
Autores principales: Schlender, Thalea, Viljanen, Markus, van Rijn, Jan N., Mohr, Felix, Peijnenburg, Willie JGM., Hoos, Holger H., Rorije, Emiel, Wong, Albert
Formato: Online Artículo Texto
Lenguaje:English
Publicado: American Chemical Society 2023
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10666535/
https://www.ncbi.nlm.nih.gov/pubmed/37315216
http://dx.doi.org/10.1021/acs.est.3c00334
Descripción
Sumario:[Image: see text] Toxicological information as needed for risk assessments of chemical compounds is often sparse. Unfortunately, gathering new toxicological information experimentally often involves animal testing. Simulated alternatives, e.g., quantitative structure–activity relationship (QSAR) models, are preferred to infer the toxicity of new compounds. Aquatic toxicity data collections consist of many related tasks—each predicting the toxicity of new compounds on a given species. Since many of these tasks are inherently low-resource, i.e., involve few associated compounds, this is challenging. Meta-learning is a subfield of artificial intelligence that can lead to more accurate models by enabling the utilization of information across tasks. In our work, we benchmark various state-of-the-art meta-learning techniques for building QSAR models, focusing on knowledge sharing between species. Specifically, we employ and compare transformational machine learning, model-agnostic meta-learning, fine-tuning, and multi-task models. Our experiments show that established knowledge-sharing techniques outperform single-task approaches. We recommend the use of multi-task random forest models for aquatic toxicity modeling, which matched or exceeded the performance of other approaches and robustly produced good results in the low-resource settings we studied. This model functions on a species level, predicting toxicity for multiple species across various phyla, with flexible exposure duration and on a large chemical applicability domain.