Prediction of bioconcentration factors in fish and invertebrates using machine learning

Miller, T.H, Gallidabino, M.D, MacRae, J.R, Owen, S.F, Bury, Nic and Barron, L.P (2018) Prediction of bioconcentration factors in fish and invertebrates using machine learning. Science of the Total Environment, 648. pp. 80-89. ISSN 0048-9697

2019 Miller et al STOTEN 648 80-89.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


The application of machine learning has recently gained interest from ecotoxicological fields for its ability to
model and predict chemical and/or biological processes, such as bioconcentration. However,
a comparison of different models and the prediction of bioconcentration in invertebrates have not been previously
evaluated. A comparison of 24 linear and machine learning models is presented herein for the prediction of
bioconcentration in fish, and important factors that influenced accumulation are identified. R² and root mean square
error (RMSE) for the test data (n=110 cases) ranged from 0.23–0.73 and 0.34–1.20, respectively. Model performance
was critically assessed with neural networks and tree-based learners showing the best performance. An
optimised 4-layer multi-layer perceptron (14 descriptors) was selected for further testing. The model was applied
for cross-species prediction of bioconcentration in a freshwater invertebrate, Gammarus pulex. The model
for G. pulex showed good performance with R² of 0.99 and 0.93 for the verification and test data, respectively. Important
molecular descriptors determined to influence bioconcentration were molecular mass (MW), octanol-water
distribution coefficient (logD), topological polar surface area (TPSA) and number of nitrogen atoms (nN),
among others. Modelling of hazard criteria such as PBT showed potential to replace the need for animal testing.
However, the use of machine learning models in the regulatory context has been minimal to date and is critically
discussed herein. The movement away from experimental estimations of accumulation to in silico modelling
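
The abstract reports model performance as R² and RMSE on held-out test data. As a minimal sketch of how such figures are commonly computed, the Python below assumes R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares). The paper may define it differently, for example as a squared correlation. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total


if __name__ == "__main__":
    # Hypothetical log BCF values, for illustration only (not data from the paper).
    observed_log_bcf = [1.0, 2.0, 3.0, 4.0]
    predicted_log_bcf = [1.0, 2.0, 3.0, 5.0]
    print("RMSE:", rmse(observed_log_bcf, predicted_log_bcf))
    print("R2:", r_squared(observed_log_bcf, predicted_log_bcf))
```

Both metrics are computed on the same observed and predicted pairs. RMSE is in the units of the modelled quantity, while R² is unitless.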

Item Type: Article
Uncontrolled Keywords: ecotoxicological fields, biological processes, chemical processes, machine learning, Gammarus pulex
Subjects: Q Science > Q Science (General)
Q Science > QH Natural history
Q Science > QH Natural history > QH301 Biology
Q Science > QL Zoology
Divisions: Faculty of Health & Science > Department of Science & Technology
Depositing User: Nic Bury
Date Deposited: 21 Aug 2018 14:11
Last Modified: 21 Aug 2018 14:11
