What research has been done on learning non-Markovian reward functions?

Asked Apr 07 '19 at 17:45

Active Dec 19 '21 at 18:47

Viewed 84 times

Recently, some work has been done on planning and learning in non-Markovian decision processes, that is, decision-making with temporally extended rewards. In these settings, a reward is received only when a particular temporal logic formula (for example, an LTL or CTL formula) is satisfied. However, I cannot find any work on learning which rewards correspond to which temporally extended behaviors.
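To make the setting concrete, here is a minimal sketch of a temporally extended reward. Everything in it is a made-up illustration: the labels `key` and `door`, the function name `temporally_extended_rewards`, and the property itself ("`door` is observed strictly after `key`"), which is tracked with a small hand-written automaton rather than a real LTL library.

```python
from typing import Iterable, List, Set


def temporally_extended_rewards(trace: Iterable[Set[str]]) -> List[float]:
    """Return the per-step rewards for a trace of label sets.

    Hypothetical property: pay 1.0 once, at the first step where 'door'
    is observed strictly after 'key' has been observed.
    """
    # Automaton state: 0 = waiting for key, 1 = key seen, 2 = property satisfied.
    state = 0
    rewards = []
    for labels in trace:
        reward = 0.0
        if state == 0 and "key" in labels:
            state = 1
        elif state == 1 and "door" in labels:
            state = 2
            reward = 1.0
        rewards.append(reward)
    return rewards


# The same observation {"door"} yields different rewards depending on history.
print(temporally_extended_rewards([{"door"}, {"key"}, set(), {"door"}, {"door"}]))
```

The reward at a step depends on the whole history, not only on the current labels, which is what makes it non-Markovian. What I am asking about is the inverse problem: recovering such a reward structure from experience instead of specifying it by hand.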

In my searches, I came across k-order MDPs (which are non-Markovian), but I did not find any RL research on them.
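For reference, this is how I understand the k-order case, as a minimal sketch. It assumes the usual reading that rewards and transitions depend only on the last k states; the helper name `augment_with_history` is hypothetical. Under that assumption, keeping a sliding window of the last k states gives an augmented state that is Markovian again.

```python
from collections import deque
from typing import Hashable, Iterable, List, Tuple


def augment_with_history(states: Iterable[Hashable], k: int) -> List[Tuple[Hashable, ...]]:
    """Map each state to a tuple of the (at most) k most recent states.

    Assumption: in a k-order process, this tuple carries all the history
    that rewards and transitions can depend on.
    """
    window = deque(maxlen=k)  # drops the oldest state automatically
    augmented = []
    for state in states:
        window.append(state)
        augmented.append(tuple(window))
    return augmented


print(augment_with_history(["a", "b", "c", "d"], k=2))
```

This only covers dependence on a fixed-length window, so it does not capture rewards tied to arbitrary temporal logic formulas.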

edited Dec 19 '21 at 18:47

nbro

asked Apr 07 '19 at 17:45

Gavin Rens

0 Answers