Questions tagged [potential-reward-shaping]

For questions about potential-based reward shaping, which was introduced in the 1999 paper "Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping" by Andrew Y. Ng et al.

4 questions
6 votes, 1 answer

How to improve the reward signal when the rewards are sparse?

When the reward is delayed, this can negatively impact a model's ability to do proper credit assignment. In the case of a sparse reward, are there ways in which this can be mitigated? In a chess example, there are certain moves that you can…
4 votes, 1 answer

Expressing Arbitrary Reward Functions as Potential-Based Advice (PBA)

I am trying to reproduce the results for the simple grid-world environment in [1]. However, using a dynamically learned PBA makes the performance worse, and I cannot obtain the results shown in Figure 1(a) of [1] (with the same…
3 votes, 2 answers

What should I do when the potential value of a state is too high?

I'm working on a reinforcement learning task where I use reward shaping as proposed in the paper "Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping" (1999) by Andrew Y. Ng, Daishi Harada, and Stuart Russell. In…
2 votes, 1 answer

Why does potential-based reward shaping seem to alter the optimal policy in this case?

It is known that potential-based reward shaping does not alter the optimal policy, whatever potential function is used [1]. I don't understand why that is. The definition is $$R' = R + F,$$ with $$F = \gamma\Phi(s') - \Phi(s),$$ where, let's suppose, $\gamma = 0.9$. If I have the following…
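
For readers of this tag, the definition quoted in the excerpt above, $R' = R + F$ with $F = \gamma\Phi(s') - \Phi(s)$, can be illustrated with a minimal Python sketch. The three-state chain, the potential values, and the rewards below are made up for illustration and are not taken from any of the questions. The sketch only checks that, along one trajectory, the shaped and unshaped discounted returns differ by $\gamma^T\Phi(s_T) - \Phi(s_0)$, a quantity that depends on the start and end states and not on the steps in between.

```python
def shaping_term(phi, s, s_next, gamma=0.9):
    """F(s, s') = gamma * Phi(s') - Phi(s)."""
    return gamma * phi[s_next] - phi[s]


def shaped_reward(r, phi, s, s_next, gamma=0.9):
    """R' = R + F."""
    return r + shaping_term(phi, s, s_next, gamma)


def discounted_return(rewards, gamma=0.9):
    """Sum of gamma^t * r_t over one trajectory."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


# Hypothetical example: states, potentials, and rewards are assumptions.
gamma = 0.9
phi = {"A": 0.0, "B": 5.0, "C": 10.0}   # made-up potential function
trajectory = ["A", "B", "C"]            # visited states s_0, s_1, s_2
rewards = [0.0, 1.0]                    # made-up environment rewards

# Apply R' = R + F to every transition (s_t, s_{t+1}).
shaped = [
    shaped_reward(r, phi, s, s_next, gamma)
    for r, s, s_next in zip(rewards, trajectory, trajectory[1:])
]

original_return = discounted_return(rewards, gamma)
shaped_return = discounted_return(shaped, gamma)

# The shaping terms telescope: only the first and last potentials remain.
offset = gamma ** len(rewards) * phi[trajectory[-1]] - phi[trajectory[0]]

print(shaped_return - original_return, offset)
```

The two printed numbers should agree up to floating-point rounding. Changing the intermediate potential `phi["B"]` changes the individual shaped rewards but not the difference between the two returns.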