Can a typical supervised learning problem be solved with reinforcement learning methods?

Asked May 05 '20 at 08:16

Active May 05 '20 at 11:03


Let's say I want to teach a neural network to classify images, and, for some reason, I insist on using reinforcement learning rather than supervised learning.

I have a dataset of images and their matching classes. For each image, I could define a reward function which is $1$ for classifying it correctly and $-1$ for classifying it incorrectly (or perhaps even a more complicated reward function where some mistakes are less costly than others). For each image $x^i$, I can loop through each class $c$ and apply a vanilla REINFORCE step: $\theta \leftarrow \theta + \alpha \, r \, \nabla_{\theta} \log \pi_{\theta}(c \mid x^i)$, where $r$ is the reward for assigning class $c$ to $x^i$.
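To make the update above concrete, here is a minimal sketch in plain Python. It assumes a linear softmax policy, which the question does not specify; the names `reinforce_step`, `cross_entropy_step`, `W`, and `alpha` are hypothetical and chosen only for illustration. A cross-entropy gradient step is included for comparison, since for a softmax policy it has the same form as the single $c = \text{label}$ term of the loop with $r = 1$.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy(W, x):
    """pi_theta(. | x) for an assumed linear softmax policy with weight rows W[k]."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return softmax(logits)

def reward(c, label):
    """+1 for the correct class, -1 otherwise, as in the question."""
    return 1.0 if c == label else -1.0

def reinforce_step(W, x, label, alpha=0.1):
    """Loop over every class c and apply theta <- theta + alpha * r * grad log pi(c | x)."""
    for c in range(len(W)):
        probs = policy(W, x)  # recomputed because W changes inside the loop
        r = reward(c, label)
        for k in range(len(W)):
            # d log pi(c | x) / d logit_k = 1[k == c] - pi(k | x)
            g_k = (1.0 if k == c else 0.0) - probs[k]
            for j in range(len(x)):
                W[k][j] += alpha * r * g_k * x[j]
    return W

def cross_entropy_step(W, x, label, alpha=0.1):
    """Gradient step on the cross-entropy loss, i.e. ascent on log pi(label | x)."""
    probs = policy(W, x)
    for k in range(len(W)):
        g_k = (1.0 if k == label else 0.0) - probs[k]
        for j in range(len(x)):
            W[k][j] += alpha * g_k * x[j]
    return W

if __name__ == "__main__":
    x, label = [1.0, 0.5], 0              # toy input, assumed for illustration
    W = [[0.0, 0.0] for _ in range(3)]    # 3 classes, 2 features
    for _ in range(20):
        reinforce_step(W, x, label)
    print(policy(W, x))
```

In this sketch the REINFORCE loop also pushes probability away from the wrong classes through the $r = -1$ terms, whereas the cross-entropy step only uses the term for the true label.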

Would that be different from using standard supervised learning methods (for example, the cross-entropy loss)? Should I expect different results?

This method actually seems better, since I could define a custom reward for each kind of misclassification, but I've never seen anyone use something like that.
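As an illustration of a custom reward per misclassification, the sketch below uses a lookup table indexed by true label and predicted class. The table values and the names `REWARD`, `custom_reward`, and `expected_reward` are hypothetical; they only show one way such a reward could be encoded, with a confusion between classes 0 and 1 penalized less than other mistakes.

```python
# Hypothetical reward table: REWARD[true_label][predicted_class].
# Confusing class 0 with class 1 (or vice versa) costs less than other errors.
REWARD = [
    [1.0, -0.2, -1.0],
    [-0.2, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]

def custom_reward(predicted, label, table=REWARD):
    """Reward for predicting `predicted` when the true class is `label`."""
    return table[label][predicted]

def expected_reward(probs, label, table=REWARD):
    """Expected reward of a stochastic policy with class probabilities `probs`."""
    return sum(p * table[label][c] for c, p in enumerate(probs))

if __name__ == "__main__":
    print(custom_reward(1, 0))
    print(expected_reward([0.5, 0.5, 0.0], 0))
```

A function like `custom_reward` could stand in for the $\pm 1$ reward in the REINFORCE step described earlier, without changing the rest of the update.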

edited May 05 '20 at 11:03

nbro


asked May 05 '20 at 08:16

Gilad Deutsch
