research article

Machine learning-based prediction of fish acute mortality: implementation, interpretation, and regulatory relevance

Gasser, Lilian • Schür, Christoph • Perez-Cruz, Fernando • Schirmer, Kristin • Baity-Jesi, Marco
June 3, 2024
Environmental Science: Advances

Regulation of chemicals requires knowledge of their toxicological effects on a large number of species, which has traditionally been acquired through in vivo testing. Recent efforts to find machine learning-based alternatives, however, have not focused on guaranteeing transparency, comparability, and reproducibility, which makes it difficult to assess the advantages and disadvantages of these methods. Comparable baseline performances are also needed. In this study, we trained regression models on the ADORE “t-F2F” challenge proposed in [Schür et al., Nature Scientific Data, 2023] to predict the acute mortality, measured as LC50 (lethal concentration 50), of organic compounds in fish. We trained LASSO, random forest (RF), XGBoost, and Gaussian process (GP) regression models and found several aspects that are stable across models: (i) using mass or molar concentrations does not affect performance; (ii) performance depends only weakly on the molecular representation of the chemicals, but (iii) strongly on how the data are split. Overall, the tree-based models RF and XGBoost performed best, predicting the log10-transformed LC50 with a root mean square error of 0.90, which corresponds to roughly an order of magnitude (a factor of about 8) on the original LC50 scale. At the level of individual chemicals, however, the models cannot consistently predict toxicity accurately enough. Predictions for single chemicals are mostly influenced by a few chemical properties, while taxonomic traits are not sufficiently captured by the models. We discuss technical and conceptual improvements that address these challenges and enhance the suitability of in silico methods for environmental hazard assessment. This work thus showcases state-of-the-art models and contributes to the ongoing discussion on regulatory integration.
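To make the error metric in the abstract concrete, the following minimal Python sketch shows how a root mean square error on log10-transformed LC50 values translates into a multiplicative factor on the original concentration scale, and how a mass concentration can be converted to a molar one. It is an illustration only, not the authors' code: the function names (`mass_to_molar`, `rmse_log10`, `fold_error`) and the example values are assumptions made up for this sketch.

```python
import math


def mass_to_molar(lc50_mg_per_l, molar_mass_g_per_mol):
    """Convert a mass concentration (mg/L) to a molar concentration (mol/L)."""
    grams_per_litre = lc50_mg_per_l / 1000.0
    return grams_per_litre / molar_mass_g_per_mol


def rmse_log10(observed, predicted):
    """Root mean square error computed on log10-transformed values."""
    errors = [math.log10(p) - math.log10(o) for o, p in zip(observed, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def fold_error(rmse):
    """Multiplicative factor on the original scale for a given log10 RMSE."""
    return 10.0 ** rmse


# Hypothetical LC50 values in mg/L, invented purely for illustration.
observed = [1.0, 10.0, 100.0]
predicted = [10.0, 10.0, 10.0]

example_rmse = rmse_log10(observed, predicted)
reported_factor = fold_error(0.90)  # RMSE value quoted in the abstract

print(f"example RMSE (log10 units): {example_rmse:.3f}")
print(f"factor for RMSE 0.90: {reported_factor:.2f}")
```

Because 10 raised to the power 0.90 is just under 8, a typical prediction differs from the measured LC50 by a factor of about 8, which is what the abstract describes as an order of magnitude.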

Type: research article
DOI: 10.1039/d4va00072b
Scopus ID: 2-s2.0-85196284359
Author(s):
  • Gasser, Lilian (École Polytechnique Fédérale de Lausanne)
  • Schür, Christoph (Eawag - Swiss Federal Institute of Aquatic Science and Technology)
  • Perez-Cruz, Fernando (École Polytechnique Fédérale de Lausanne)
  • Schirmer, Kristin (Eawag - Swiss Federal Institute of Aquatic Science and Technology)
  • Baity-Jesi, Marco (Eawag - Swiss Federal Institute of Aquatic Science and Technology)
Date Issued: 2024-06-03
Published in: Environmental Science: Advances
Volume: 3
Issue: 8
Start page: 1124
End page: 1138
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: SDSC-GE
Funding: European Union's Horizon Europe research and innovation program, grant number 101057014

Available on Infoscience: January 21, 2025
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/243142
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved