On the Conflict Between Robustness and Learning in Collaborative Machine Learning
Collaborative Machine Learning (CML) allows participants to jointly train a machine learning model while keeping their training data private. In many scenarios where CML is seen as the solution to privacy issues, such as health-related applications, safety is also a primary concern. To ensure that CML processes produce models that output correct and reliable decisions even in the presence of potentially untrusted participants, researchers propose to use robust aggregators to filter out malicious contributions that negatively influence the training process. In this paper, we prove that the two prevalent forms of robust aggregators in the literature cannot eliminate the risk of compromise without preventing learning: to learn from collaboration, participants must always accept the risk of being subjected to harmful adversarial manipulation. These robust aggregators are therefore unsuitable for high-stakes applications such as healthcare or autonomous driving, in which errors can result in physical harm. We empirically demonstrate the correctness of our theoretical findings on a selection of existing robust aggregators and relevant applications, including end-to-end results showing that, when existing robust aggregators are used, an adversary can cause incorrect medical diagnoses or make self-driving cars miss turns.
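To illustrate the kind of defense the abstract refers to, below is a minimal sketch of one common robust aggregator, the coordinate-wise median over participants' model updates. This is an illustrative example only, not one of the specific aggregators analyzed in the paper; the coordinate_wise_median helper and the toy data are hypothetical.

    import numpy as np

    def coordinate_wise_median(updates):
        # Aggregate client updates by taking the median of each coordinate.
        # `updates` has shape (n_clients, n_params); hypothetical example data.
        return np.median(updates, axis=0)

    # Toy example: 4 honest clients send similar gradients, 1 malicious client
    # sends a large outlier; the median suppresses (but does not eliminate)
    # the outlier's influence on the aggregated update.
    honest = np.random.normal(loc=1.0, scale=0.1, size=(4, 3))
    malicious = np.full((1, 3), 100.0)
    updates = np.vstack([honest, malicious])
    print(coordinate_wise_median(updates))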
École Polytechnique Fédérale de Lausanne
2025-06-16
ISBN: 979-8-3315-2236-0
Pages: 2171-2189
REVIEWED
EPFL
Event name | Event acronym | Event place | Event date
IEEE Symposium on Security and Privacy | SP 2025 | San Francisco, CA, USA | 2025-05-12 - 2025-05-15