Does apprenticeship learning require prospective data?

Asked Feb 19 '20 at 11:32

Active Feb 19 '20 at 18:06

I am thinking of applying apprenticeship learning to retrospective data. In the paper by Abbeel and Ng that introduces apprenticeship learning (https://ai.stanford.edu/~ang/papers/icml04-apprentice.pdf), the 5th step of the algorithm reads:

Compute (or estimate) $\mu^{(i)} = \mu(\pi^{(i)})$, where $\mu^{(i)} = E\left[\sum_{t=0}^{\infty} \gamma^{t} \phi(s_{t}) \mid \pi^{(i)}\right]$ and $\phi(s_{t})$ is the reward feature vector at state $s_t$.

From my understanding, a trajectory $s_0, s_1, s_2, \dots$ would have to be generated at this step by following the policy $\pi^{(i)}$. Does this mean that the algorithm cannot be applied to retrospective data?
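To make the concern concrete, here is a minimal sketch of how I understand step 5 would be estimated with Monte Carlo rollouts. The names `estimate_feature_expectations`, `policy`, `step`, and `phi`, the toy chain environment, and the truncation at a finite `horizon` are my own illustrative assumptions, not something taken from the paper.

```python
import numpy as np

def estimate_feature_expectations(policy, step, phi, s0,
                                  gamma=0.9, n_rollouts=100, horizon=50):
    """Monte Carlo estimate of mu(pi) = E[sum_t gamma^t phi(s_t) | pi].

    The infinite sum is truncated at `horizon` (an assumption of this sketch).
    """
    total = np.zeros(len(phi(s0)), dtype=float)
    for _ in range(n_rollouts):
        s = s0
        for t in range(horizon):
            total += (gamma ** t) * phi(s)
            # The next state comes from acting with the current policy,
            # which is why fresh (prospective) rollouts seem to be needed.
            s = step(s, policy(s))
    return total / n_rollouts

# Hypothetical toy problem: a 4-state chain where the only action moves
# one state to the right, and the last state is absorbing.
N_STATES = 4

def phi(s):
    """One-hot feature vector for state s."""
    v = np.zeros(N_STATES)
    v[s] = 1.0
    return v

def policy(s):
    return 1  # always "move right"

def step(s, a):
    return min(s + a, N_STATES - 1)

mu = estimate_feature_expectations(policy, step, phi, s0=0, gamma=0.5)
print(mu)
```

The call to `step(s, policy(s))` is the part that requires interacting with the environment (or a simulator of it) under $\pi^{(i)}$, which a fixed retrospective dataset does not directly provide.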

edited Feb 19 '20 at 18:06

nbro

asked Feb 19 '20 at 11:32

calveeen


0 Answers