A prescriptive Dirichlet power allocation policy with deep reinforcement learning

Tian, Yuan; Han, Minghao; Kulkarni, Chetan; Fink, Olga

doi:10.1016/j.ress.2022.108529

research article

August 1, 2022

Reliability Engineering & System Safety

Prescribing optimal operation based on the condition of a system, and thereby potentially prolonging its remaining useful lifetime, has great potential for actively managing the availability, maintenance, and costs of complex systems. Reinforcement learning (RL) algorithms are particularly suitable for this type of problem given their learning capabilities. A special case of prescriptive operation is the power allocation task, which can be framed as a sequential allocation problem in which the action space is bounded by a simplex constraint. A general continuous action-space solution to such sequential allocation problems remains an open research question for RL algorithms. In continuous action spaces, the standard Gaussian policy used in RL does not support simplex constraints, while the Gaussian-softmax policy introduces a bias during training. In this work, we propose the Dirichlet policy for continuous allocation tasks and analyze the bias and variance of its policy gradients. We demonstrate that the Dirichlet policy is bias-free and provides significantly faster convergence, better performance, and better robustness to hyperparameter changes than the Gaussian-softmax policy. Moreover, we demonstrate the applicability of the proposed algorithm to a prescriptive operation case: we propose the Dirichlet power allocation policy and evaluate its performance in a case study of multiple lithium-ion (Li-I) battery systems. The experimental results demonstrate the potential to prescribe optimal operation and thereby improve the efficiency and sustainability of multi-power source systems.
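To make the simplex constraint concrete, the following is a minimal Python sketch of how an allocation could be sampled from a Dirichlet policy and from a Gaussian-softmax policy. It is an illustration only, not the authors' implementation: the function names, the softplus-plus-one mapping from network outputs to concentration parameters, and the example numbers are all assumptions introduced here.

```python
import math
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def dirichlet_action(logits, rng):
    # Assumption: map unconstrained policy outputs to concentration
    # parameters alpha > 1 via softplus + 1 (one possible choice).
    alpha = softplus(np.asarray(logits, dtype=float)) + 1.0
    # A Dirichlet sample is non-negative and sums to 1, so it lies on the simplex.
    action = rng.dirichlet(alpha)
    return action, alpha

def dirichlet_log_prob(action, alpha):
    # Log-density of the Dirichlet distribution, as needed by a policy-gradient loss.
    log_norm = math.lgamma(float(np.sum(alpha))) - sum(math.lgamma(float(a)) for a in alpha)
    return log_norm + float(np.sum((alpha - 1.0) * np.log(action)))

def gaussian_softmax_action(mean, std, rng):
    # Baseline for comparison: sample a Gaussian, then squash it onto the simplex.
    z = rng.normal(np.asarray(mean, dtype=float), std)
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

rng = np.random.default_rng(0)
logits = [0.5, -1.0, 2.0]  # hypothetical outputs for three power sources
action, alpha = dirichlet_action(logits, rng)
log_prob = dirichlet_log_prob(action, alpha)
baseline = gaussian_softmax_action(logits, 0.5, rng)
print(action, action.sum(), log_prob)
```

Each entry of `action` can be read as the fraction of the total power demand assigned to one source. Both samplers return points on the simplex; the difference highlighted in the abstract concerns the policy gradient, which this sketch does not compute.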

Type

research article

DOI

10.1016/j.ress.2022.108529

Web of Science ID

WOS:000800634300003

Author(s)

Tian, Yuan

Han, Minghao

Kulkarni, Chetan

Fink, Olga

Date Issued

2022-08-01

Published in

Reliability Engineering & System Safety

Volume

224

Article Number

108529

Subjects

Engineering, Industrial

Operations Research & Management Science

Engineering

reinforcement learning

deep learning

prescriptive operation

multi-power source systems

resources allocation

management

strategy

system

model

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

IIC

Available on Infoscience

June 20, 2022

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/188666