Algorithms for average reward reinforcement learning in continuous/general state-action space

Asked Oct 10 '23 at 19:47

Active Oct 10 '23 at 19:47

Viewed 44 times

Discounted-reward reinforcement learning has been studied extensively in the literature. The average-reward criterion has received much less attention, and there appear to be fewer algorithms for it (R-learning, H-learning, SMART, etc.) than for the discounted criterion. Could you suggest any algorithms for average-reward reinforcement learning in continuous or general state-action spaces?

asked Oct 10 '23 at 19:47

k2pctdn

0 Answers