Infoscience

conference paper

Impact of Redundancy on Resilience in Distributed Optimization and Learning

Liu, Shuo • Gupta, Nirupam • Vaidya, Nitin H.
January 1, 2023
Proceedings of the 24th International Conference on Distributed Computing and Networking, ICDCN 2023
24th International Conference on Distributed Computing and Networking (ICDCN)

This paper considers the problem of resilient distributed optimization and stochastic learning in a server-based architecture. The system comprises a server and multiple agents, where each agent has its own local cost function. The agents collaborate with the server to find a minimum of the aggregate of the local cost functions. In the context of stochastic learning, the local cost of an agent is the loss function computed over the data at that agent. In this paper, we consider this problem in a system wherein some of the agents may be Byzantine faulty and some of the agents may be slow (also called stragglers). In this setting, we investigate the conditions under which it is possible to obtain an "approximate" solution to the above problem. In particular, we introduce the notion of (f, r; epsilon)-resilience to characterize how well the true solution is approximated in the presence of up to f Byzantine faulty agents and up to r slow agents (or stragglers); smaller epsilon represents a better approximation. We also introduce a measure named (f, r; epsilon)-redundancy to characterize the redundancy in the cost functions of the agents. Greater redundancy allows for a better approximation when solving the problem of aggregate cost minimization.

In this paper, we constructively show (both theoretically and empirically) that (f, r; O(epsilon))-resilience can indeed be achieved in practice, given that the local cost functions are sufficiently redundant. Our empirical evaluation considers a distributed gradient descent (DGD)-based solution; for distributed learning in the presence of Byzantine and asynchronous agents, we also evaluate a distributed stochastic gradient descent (D-SGD)-based algorithm.
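The record contains no code, but as a rough illustration of the setting the abstract describes, below is a minimal Python sketch of a server-based distributed gradient descent loop that tolerates up to r stragglers by using only the fastest n - r replies and hedges against up to f Byzantine agents with a generic robust aggregation rule (coordinate-wise trimmed mean). The agent.gradient(x) interface, the choice of trimmed mean, and all parameter names are assumptions made for illustration; this is not the specific algorithm analyzed or evaluated in the paper.

```python
import numpy as np

def trimmed_mean(grads, f):
    """Coordinate-wise trimmed mean: per coordinate, drop the f smallest and
    f largest received values, then average the remainder."""
    g = np.sort(np.stack(grads), axis=0)  # shape (m, d), sorted per coordinate
    assert g.shape[0] > 2 * f, "need more than 2f gradients to trim"
    return g[f:g.shape[0] - f].mean(axis=0)

def resilient_dgd(x0, agents, f, r, steps=200, lr=0.1):
    """Server loop: broadcast x, use only the fastest n - r replies
    (straggler tolerance), aggregate them robustly, take a gradient step."""
    x = np.asarray(x0, dtype=float)
    n = len(agents)
    for _ in range(steps):
        # Hypothetical agent interface: agent.gradient(x) returns the local
        # gradient; a Byzantine agent may return an arbitrary vector.
        # Slicing to the first n - r agents stands in for collecting whichever
        # n - r replies arrive first in a real asynchronous deployment.
        replies = [agent.gradient(x) for agent in agents[: n - r]]
        x = x - lr * trimmed_mean(replies, f)
    return x
```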

Type
conference paper
DOI
10.1145/3571306.3571393
Web of Science ID
WOS:001098722500010
Author(s)
Liu, Shuo
Gupta, Nirupam  
Vaidya, Nitin H.
Corporate authors
Association for Computing Machinery
Date Issued
2023-01-01
Publisher
Assoc Computing Machinery
Publisher place
New York
Published in
Proceedings of the 24th International Conference on Distributed Computing and Networking, ICDCN 2023
ISBN of the book
978-1-4503-9796-4
Start page
80
End page
89
Subjects
Technology • Distributed Optimization • Resilient Optimization • Fault Tolerance • Machine Learning
Editorial or Peer reviewed
REVIEWED
Written at
EPFL
EPFL units
DCL  
Event name
24th International Conference on Distributed Computing and Networking (ICDCN)
Event place
Kharagpur, India
Event date
January 4-7, 2023
Funder
Army Research Laboratory (Grant W911NF-17-2-0196)
Georgetown University
Available on Infoscience
February 20, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/204346