Infoscience
 

Credit Assignment Safety Learning from Human Demonstrations

Prabhakar, Ahalya • Billard, Aude
2021
AAAI 2021 Fall Symposium Series: Artificial Intelligence for Human-Robot Interaction (AI-HRI)

A critical need in assistive robotics, such as assistive wheelchairs for navigation, is to learn task intent and safety guarantees through user interactions in order to ensure safe task performance. For tasks where the user's objectives are not easily defined, learning from user demonstrations has been a key step in enabling learning. However, most robot learning from demonstration (LfD) methods rely primarily on optimal demonstrations to successfully learn a control policy, which can be challenging to acquire from novice users. Recent work does use suboptimal and failed demonstrations to learn about task intent; however, few methods focus on learning safety guarantees to prevent repeating experienced failures, an essential requirement for assistive robots. Furthermore, interactive human-robot learning aims to minimize the effort required from the human user so that real-world deployment is feasible. As such, requiring users to go through the demonstrations and label the unsafe states or keyframes should not be a prerequisite for learning. Here, we propose an algorithm that learns a safety value function from a set of suboptimal and failed demonstrations and uses it to generate a real-time safety control filter. Importantly, we develop a credit assignment method that extracts the failure states from the failed demonstrations without requiring human labelling or prespecified knowledge of unsafe regions. Furthermore, we extend our formulation to allow for user-specific safety functions by incorporating user-defined safety rankings, from which we can generate safety level sets according to the users' preferences. By using both suboptimal and failed demonstrations together with the developed credit assignment formulation, we enable learning a safety value function with minimal effort from the user, making the approach more feasible for widespread use in human-robot interactive learning tasks.
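The abstract above only summarizes the approach; the paper's actual formulation is not reproduced on this record page. Purely as an illustrative sketch of the general idea, the snippet below assumes a simple credit assignment rule (blame decays exponentially backward from the failure end of a trajectory), a nearest-neighbour estimate of the safety value, and a filter that rejects candidate actions whose predicted next state falls below a safety threshold. All function names, the decay rule, and the toy dynamics are invented for illustration and are not from the paper.

```python
# Illustrative sketch only (NOT the paper's algorithm): learn a coarse safety
# value from a failed demonstration, then use it as a real-time action filter.
import numpy as np

def assign_credit(failed_traj, horizon=5, decay=0.8):
    """Credit assignment (assumed rule): blame the last `horizon` states
    before failure, with exponentially decaying weight away from the end."""
    labels = np.zeros(len(failed_traj))
    for i in range(1, min(horizon, len(failed_traj)) + 1):
        labels[-i] = decay ** (i - 1)   # final (failure) state gets blame 1.0
    return labels

def safety_value(query, states, unsafety, k=3):
    """k-nearest-neighbour safety estimate: 1.0 = safe, 0.0 = unsafe."""
    d = np.linalg.norm(states - query, axis=1)
    nearest = np.argsort(d)[:k]
    return 1.0 - unsafety[nearest].mean()

def safety_filter(state, candidate_actions, step, states, unsafety, threshold=0.5):
    """Keep only actions whose predicted next state is safe enough;
    fall back to the single safest action if none pass the threshold."""
    safe = [a for a in candidate_actions
            if safety_value(step(state, a), states, unsafety) >= threshold]
    return safe or [max(candidate_actions,
                        key=lambda a: safety_value(step(state, a), states, unsafety))]

# Toy 1-D example: one failed demonstration that drifts past x = 1.0 and fails.
failed = np.linspace(0.0, 1.2, 7).reshape(-1, 1)
unsafety = assign_credit(failed, horizon=3)
step = lambda s, a: s + a                     # trivial dynamics for the sketch
actions = [np.array([0.3]), np.array([-0.3])]
# From x = 0.9, moving toward the failure region is filtered out.
print(safety_filter(np.array([0.9]), actions, step, failed, unsafety))
```

In this toy run, the action `+0.3` would land near the end of the failed trajectory (high blame) and is rejected, while `-0.3` moves back toward states the demonstration traversed safely and passes the filter — mirroring, in miniature, how a learned safety value can gate commands without anyone hand-labelling unsafe states.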

Details
Type
conference paper not in proceedings
Author(s)
Prabhakar, Ahalya  
Billard, Aude
Date Issued

2021

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LASA  
Event name
AAAI 2021 Fall Symposium Series: Artificial Intelligence for Human-Robot Interaction (AI-HRI)
Event place
Virtual
Event date
November 4-6, 2021

Available on Infoscience
December 14, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/193237
  • Contact
  • infoscience@epfl.ch


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.