Abstract

This paper proposes a safe reinforcement learning algorithm for generation bidding decisions and unit maintenance scheduling in a competitive electricity market. Each generation unit seeks a bidding strategy that maximizes its revenue while preserving its reliability through scheduled preventive maintenance. Maintenance scheduling imposes safety constraints that must be satisfied at all times. Meeting these critical safety and reliability requirements is challenging when the generation units have incomplete information about each other's bidding strategies. Bi-level optimization and reinforcement learning are state-of-the-art approaches for solving this type of problem. However, neither approach can handle the challenges of incomplete information and critical safety constraints. To tackle these challenges, we propose the safe deep deterministic policy gradient reinforcement learning algorithm, which combines reinforcement learning with a predicted safety filter. The case study demonstrates that the proposed approach yields a higher profit than other state-of-the-art methods while satisfying the system safety constraints. Moreover, it shows that the reward of the learning algorithm under incomplete information can converge to the reward of the complete-information game.
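
To illustrate the idea of placing a safety filter between a learned policy and the market, the following minimal sketch shows one possible form such a filter could take. It is not the algorithm of this paper: the state fields, the `max_hours` maintenance limit, the one-step prediction rule, and all names (`UnitState`, `Action`, `safety_filter`) are hypothetical and chosen only for illustration. The filter passes a proposed action through unchanged when it is predicted to be safe and otherwise replaces it with a safe one.

```python
from dataclasses import dataclass


@dataclass
class UnitState:
    hours_since_maintenance: int  # running hours since the last maintenance
    max_hours: int                # hypothetical limit before maintenance is mandatory
    capacity_mw: float            # hypothetical maximum output of the unit


@dataclass
class Action:
    bid_mw: float    # quantity offered to the market
    maintain: bool   # True if the unit goes offline for maintenance


def safety_filter(state: UnitState, proposed: Action) -> Action:
    """Return the proposed action if it is safe, otherwise a safe replacement."""
    # One-step prediction (an assumption of this sketch): running for
    # another hour would exceed the maintenance limit.
    must_maintain = state.hours_since_maintenance + 1 > state.max_hours
    if proposed.maintain or must_maintain:
        # A unit under maintenance cannot generate, so its bid is zero.
        return Action(bid_mw=0.0, maintain=True)
    # Otherwise keep the learned bid, clipped to the physical capacity.
    bid = min(max(proposed.bid_mw, 0.0), state.capacity_mw)
    return Action(bid_mw=bid, maintain=False)
```

In such a scheme, the reinforcement learning agent would propose an action at every step, and only the filtered action would be submitted, so the constraint holds during both training and deployment under the stated assumptions.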

Details