Prediction of bioconcentration factors in fish and invertebrates using machine learning

The application of machine learning has recently gained interest from ecotoxicological fields for its ability to model and predict chemical and/or biological processes, such as the prediction of bioconcentration. However, comparison of different models and the prediction of bioconcentration in inver...

Bibliographic Details
Main authors: Miller, Thomas H., Gallidabino, Matteo D., MacRae, James I., Owen, Stewart F., Bury, Nicolas R., Barron, Leon P.
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6234108/
https://www.ncbi.nlm.nih.gov/pubmed/30114591
http://dx.doi.org/10.1016/j.scitotenv.2018.08.122
collection PubMed
description The application of machine learning has recently gained interest from ecotoxicological fields for its ability to model and predict chemical and/or biological processes, such as the prediction of bioconcentration. However, a comparison of different models and the prediction of bioconcentration in invertebrates have not been previously evaluated. A comparison of 24 linear and machine learning models is presented herein for the prediction of bioconcentration in fish, and important factors that influenced accumulation are identified. R² and root mean square error (RMSE) for the test data (n = 110 cases) ranged from 0.23 to 0.73 and from 0.34 to 1.20, respectively. Model performance was critically assessed, with neural networks and tree-based learners showing the best performance. An optimised 4-layer multi-layer perceptron (14 descriptors) was selected for further testing. The model was applied for cross-species prediction of bioconcentration in a freshwater invertebrate, Gammarus pulex. The model for G. pulex showed good performance, with R² of 0.99 and 0.93 for the verification and test data, respectively. Important molecular descriptors determined to influence bioconcentration were molecular mass (MW), octanol-water distribution coefficient (logD), topological polar surface area (TPSA) and number of nitrogen atoms (nN), among others. Modelling of hazard criteria such as PBT showed potential to replace the need for animal testing. However, the use of machine learning models in the regulatory context has been minimal to date and is critically discussed herein. The movement away from experimental estimations of accumulation to in silico modelling would enable rapid prioritisation of contaminants that may pose a risk to environmental health and the food chain.
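The abstract reports model performance as R² and RMSE on held-out test data. As a minimal sketch of how those two statistics are commonly computed, assuming R² is the coefficient of determination (the record does not say which definition the authors used), the function names and the sample values below are hypothetical and are not taken from the article:

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented values for illustration only; not data from the article.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(r_squared(observed, predicted))
print(rmse(observed, predicted))
```

A higher R² and a lower RMSE both indicate closer agreement between predicted and measured values; RMSE is expressed in the same units as the modelled quantity.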
id pubmed-6234108
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6234108 2019-01-15 Sci Total Environ Article Elsevier Text en © 2018 The Authors. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).