
We can use DDPG to train agents to stack objects, and stacking can be viewed as grasping followed by pick-and-place. Where does meta-reinforcement learning fit in this context? Does it mean I can use grasping and pick-and-place as training tasks and generalise to assembling objects?

nbro
Sofi

1 Answer


Meta-learning can mean many things, but at its core it is about having a second layer of optimisation on top of the usual one needed to solve your task.

For instance, in RL for robotics you may have a Soft Actor-Critic (SAC) agent that learns to pick and place by initialising a random neural network and then learning which weights minimise a loss function related to successful picks. Given this architecture, you can fix a meta-goal on top of the base goal: for instance, being not only precise (base goal) but also fast (meta-goal) at picking, or maximising human safety, minimising robot wear, and so on.
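To make the "usual" (base) layer of optimisation concrete, here is a minimal sketch. All names are hypothetical and the setup is deliberately toy: a real SAC agent would optimise actor and critic networks against replayed transitions, not a single scalar parameter.

```python
# Base-level optimisation only: one agent, one fixed loss, one fixed
# initialisation. (Hypothetical toy stand-in for an RL training loop.)

def pick_loss(theta):
    """Toy stand-in for a 'loss related to successful picks':
    minimised when theta == 1.0 (a perfectly precise pick)."""
    return (theta - 1.0) ** 2

def base_train(theta_init, lr=0.1, steps=100):
    """Plain gradient descent from a given initialisation."""
    theta = theta_init
    for _ in range(steps):
        grad = 2.0 * (theta - 1.0)  # analytic gradient of pick_loss
        theta -= lr * grad
    return theta

theta = base_train(theta_init=0.0)  # converges close to 1.0
```

Everything here (the loss shape, the learning rate, the initialisation) is fixed by hand; meta-learning, described next, is about optimising some of those choices themselves.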

Now you can meta-learn the best meta-parameters to achieve this meta-goal. Examples of meta-parameters include the network initialisation, the shape of the loss function, the network architecture, etc.
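The two-level structure can be sketched as follows. Assumptions (all hypothetical, not from the original post): each "task" is a 1-D quadratic with a different target, the "policy" is a single scalar parameter, and the meta-parameter being learned is the initialisation, in the spirit of MAML (Finn et al., 2017).

```python
def task_loss(theta, target):
    # Base loss: distance of the adapted parameter from this task's optimum.
    return (theta - target) ** 2

def inner_adapt(theta0, target, lr=0.1, steps=5):
    """Base (inner) optimisation: adapt to one task starting from theta0."""
    theta = theta0
    for _ in range(steps):
        theta -= lr * 2.0 * (theta - target)  # gradient of task_loss
    return theta

def meta_train(targets, meta_lr=0.05, meta_steps=500):
    """Meta (outer) optimisation: learn an initialisation theta0 that
    minimises the post-adaptation loss averaged over training tasks."""
    def meta_loss(t0):
        return sum(task_loss(inner_adapt(t0, t), t) for t in targets) / len(targets)

    theta0 = 0.0
    eps = 1e-4
    for _ in range(meta_steps):
        # Central finite-difference meta-gradient keeps the sketch
        # dependency-free; a real implementation would backpropagate
        # through the inner loop instead.
        g = (meta_loss(theta0 + eps) - meta_loss(theta0 - eps)) / (2 * eps)
        theta0 -= meta_lr * g
    return theta0

theta0 = meta_train([1.0, 2.0, 3.0])  # learned initialisation, near 2.0
```

The inner loop is the ordinary optimisation from the SAC example; the outer loop treats the initialisation as a meta-parameter and optimises it so that adaptation to any of the training tasks is fast, which is exactly the "second layer of optimisation" described above.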

Check out "Meta-Learning in Neural Networks: A Survey" (https://arxiv.org/abs/2004.05439).

Rexcirus