Infoscience

conference paper

Efficiently Escaping Saddle Points for Policy Optimization

Khorasani, Mohammadsadegh • Salehkaleybar, Saber • Kiyavash, Negar • He, Niao • Grossglauser, Matthias
August 21, 2025
Proceedings of the 41st Conference on Uncertainty in Artificial Intelligence
41st Conference on Uncertainty in Artificial Intelligence (UAI 2025)

Policy gradient (PG) is widely used in reinforcement learning due to its scalability and good performance. In recent years, several variance-reduced PG methods have been proposed with a theoretical guarantee of converging to an approximate first-order stationary point (FOSP) with the sample complexity of O(ϵ^{-3}). However, FOSPs could be bad local optima or saddle points. Moreover, these algorithms often use importance sampling (IS) weights which could impair the statistical effectiveness of variance reduction. In this paper, we propose a variance-reduced second-order method that uses second-order information in the form of Hessian vector products (HVP) and converges to an approximate second-order stationary point (SOSP) with sample complexity of Õ(ϵ^{-3}). This rate improves the best-known sample complexity for achieving approximate SOSPs by a factor of O(ϵ^{-0.5}). Moreover, the proposed variance reduction technique bypasses IS weights by using HVP terms. Our experimental results show that the proposed algorithm outperforms the state of the art and is more robust to changes in random seeds.
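
The difference between a first-order and a second-order stationary point, and the role a Hessian-vector product plays in telling them apart, can be illustrated on a toy problem. The sketch below is not the paper's algorithm and involves no policy, sampling, or variance reduction. It assumes a hypothetical two-variable objective f(x, y) = x^2 - y^2 that is being minimized (for return maximization the signs flip), with an analytic gradient and a finite-difference HVP; the names `grad`, `hvp`, and `curvature` are illustrative only.

```python
def grad(p):
    """Analytic gradient of the toy objective f(x, y) = x**2 - y**2."""
    x, y = p
    return [2.0 * x, -2.0 * y]


def hvp(grad_fn, p, v, h=1e-5):
    """Approximate the Hessian-vector product H(p) v with a central
    difference of gradients; the Hessian matrix is never formed."""
    plus = grad_fn([pi + h * vi for pi, vi in zip(p, v)])
    minus = grad_fn([pi - h * vi for pi, vi in zip(p, v)])
    return [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]


def curvature(grad_fn, p, v):
    """Return v^T H(p) v, the curvature of f at p along direction v."""
    return sum(vi * hi for vi, hi in zip(v, hvp(grad_fn, p, v)))


origin = [0.0, 0.0]

# The gradient vanishes at the origin, so it is a first-order stationary point.
grad_norm = sum(g * g for g in grad(origin)) ** 0.5

# Curvature is positive along x but negative along y, so the origin is a
# saddle point and does not satisfy a second-order condition for minimization.
curv_x = curvature(grad, origin, [1.0, 0.0])
curv_y = curvature(grad, origin, [0.0, 1.0])

print(grad_norm, curv_x, curv_y)
```

A purely first-order test would accept the origin because the gradient norm is zero there. The negative curvature along the y direction, obtained from two gradient evaluations rather than a full Hessian, is what exposes it as a saddle point.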

Type: conference paper
Author(s):
  • Khorasani, Mohammadsadegh (EPFL)
  • Salehkaleybar, Saber (LIACS)
  • Kiyavash, Negar (EPFL)
  • He, Niao (ETHZ)
  • Grossglauser, Matthias (EPFL)

Date Issued: 2025-08-21
Published in: Proceedings of the 41st Conference on Uncertainty in Artificial Intelligence
Series title/Series vol.: PMLR; 244
Start page: 2143
End page: 2162
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: BAN, INDY1
Event name: 41st Conference on Uncertainty in Artificial Intelligence (UAI 2025)
Event acronym: UAI 2025
Event place: Rio de Janeiro, Brazil
Event date: 2025-07-21 to 2025-07-26

Available on Infoscience
September 30, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/254505
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved