Publication:

The value of human data annotation for machine learning based anomaly detection in environmental systems

cris.lastimport.scopus

2024-08-09T12:29:08Z

cris.legacyId

290572

cris.virtual.parent-organization

IIE

cris.virtual.parent-organization

ENAC

cris.virtual.parent-organization

EPFL

cris.virtual.sciperId

303894

cris.virtual.unitId

12616

cris.virtual.unitManager

Beyer, Katrin

cris.virtualsource.author-scopus

b9e1728c-8b9f-46c0-a93d-2ad1508d830b

cris.virtualsource.department

b9e1728c-8b9f-46c0-a93d-2ad1508d830b

cris.virtualsource.orcid

b9e1728c-8b9f-46c0-a93d-2ad1508d830b

cris.virtualsource.parent-organization

b7d236a3-595d-4d9e-a2c7-45f495f14460

cris.virtualsource.parent-organization

b7d236a3-595d-4d9e-a2c7-45f495f14460

cris.virtualsource.parent-organization

b7d236a3-595d-4d9e-a2c7-45f495f14460

cris.virtualsource.parent-organization

b7d236a3-595d-4d9e-a2c7-45f495f14460

cris.virtualsource.rid

b9e1728c-8b9f-46c0-a93d-2ad1508d830b

cris.virtualsource.sciperId

b9e1728c-8b9f-46c0-a93d-2ad1508d830b

cris.virtualsource.unitId

b7d236a3-595d-4d9e-a2c7-45f495f14460

cris.virtualsource.unitManager

b7d236a3-595d-4d9e-a2c7-45f495f14460

datacite.rights

metadata-only

dc.contributor.author

Russo, Stefania

dc.contributor.author

Besmer, Michael D.

dc.contributor.author

Blumensaat, Frank

dc.contributor.author

Bouffard, Damien

dc.contributor.author

Disch, Andy

dc.contributor.author

Hammes, Frederik

dc.contributor.author

Hess, Angelika

dc.contributor.author

Lurig, Moritz

dc.contributor.author

Matthews, Blake

dc.contributor.author

Minaudo, Camille

dc.contributor.author

Morgenroth, Eberhard

dc.contributor.author

Tran-Khac, Viet

dc.contributor.author

Villez, Kris

dc.date.accessioned

2021-12-04T02:29:46

dc.date.available

2021-12-04T02:29:46

dc.date.created

2021-12-04

dc.date.issued

2021-11-01

dc.date.modified

2024-10-18T06:12:03.814180Z

dc.description.abstract

Anomaly detection is the process of identifying unexpected data samples in datasets. Automated anomaly detection is performed using either supervised machine learning models, which require a labelled dataset for their calibration, or unsupervised models, which do not require labels. While academic research has produced a vast array of tools and machine learning models for automated anomaly detection, the research community focused on environmental systems still lacks a comparative analysis that is simultaneously comprehensive, objective, and systematic. This knowledge gap is addressed for the first time in this study, where 15 different supervised and unsupervised anomaly detection models are evaluated on 5 different environmental datasets from engineered and natural aquatic systems. To this end, anomaly detection performance, labelling effort, and the impact of model and algorithm tuning are taken into account. As a result, our analysis reveals the relative strengths and weaknesses of the different approaches in an objective manner, without bias toward any particular paradigm in machine learning. Most importantly, our results show that expert-based data annotation is extremely valuable for anomaly detection based on machine learning.

dc.description.sponsorship

APHYS

dc.identifier.doi

10.1016/j.watres.2021.117695

dc.identifier.isi

WOS:000713194100009

dc.identifier.uri

https://infoscience.epfl.ch/handle/20.500.14299/183659

dc.publisher

PERGAMON-ELSEVIER SCIENCE LTD

dc.publisher.place

Oxford

dc.relation.issn

0043-1354

dc.relation.issn

1879-2448

dc.relation.journal

Water Research

dc.source

WoS

dc.subject

Engineering, Environmental

dc.subject

Environmental Sciences

dc.subject

Water Resources

dc.subject

Engineering

dc.subject

Environmental Sciences & Ecology

dc.subject

machine learning

dc.subject

anomaly detection

dc.subject

environmental systems

dc.subject

labels

dc.subject

principal component analysis

dc.subject

sequencing batch reactor

dc.subject

fault-detection

dc.subject

water-quality

dc.subject

multivariate

dc.subject

regression

dc.subject

network

dc.title

The value of human data annotation for machine learning based anomaly detection in environmental systems

dc.type

text::journal::journal article::research article

dspace.entity.type

Publication

dspace.legacy.oai-identifier

oai:infoscience.epfl.ch:290572

epfl.curator.email

jorge.rodriguesdematos@epfl.ch

epfl.legacy.itemtype

Journal Articles

epfl.legacy.submissionform

ARTICLE

epfl.oai.currentset

ENAC

epfl.oai.currentset

article

epfl.oai.currentset

OpenAIREv4

epfl.peerreviewed

REVIEWED

epfl.publication.version

http://purl.org/coar/version/c_970fb48d4fbd8a85

epfl.writtenAt

EPFL

oaire.citation.articlenumber

117695

oaire.citation.volume

206

Files

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed to upon submission