
We can use DDPG to train agents to stack objects, and stacking can be viewed as grasping followed by pick-and-place. Where does meta-reinforcement learning fit in this context? Does it mean I can use grasping and pick-and-place as training tasks and generalise to assembling objects?

nbro
Sofi

1 Answer


Meta-learning can mean many things, but at its core it is about having a second layer of optimisation on top of the usual one needed to solve your task.

For instance, in RL for robotics you may have a Soft Actor-Critic (SAC) agent that learns to pick and place by initialising a random neural network and then learning which weights minimise a loss function related to successful picks. Given this architecture, you can fix a meta-goal on top of the base goal: for instance, being not only precise (base goal) but also fast (meta-goal) at picking, or maximising human safety, minimising robot wear, and so on.
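To make the "usual" (base) layer of optimisation concrete, here is a minimal sketch. All names are hypothetical and the setup is deliberately toy: a real SAC agent would optimise actor and critic networks against replayed transitions, not a single scalar parameter.

```python
# Base-level optimisation only: one agent, one fixed loss, one fixed
# initialisation. (Hypothetical toy stand-in for an RL training loop.)

def pick_loss(theta):
    """Toy stand-in for a 'loss related to successful picks':
    minimised when theta == 1.0 (a perfectly precise pick)."""
    return (theta - 1.0) ** 2

def base_train(theta_init, lr=0.1, steps=100):
    """Plain gradient descent from a given initialisation."""
    theta = theta_init
    for _ in range(steps):
        grad = 2.0 * (theta - 1.0)  # analytic gradient of pick_loss
        theta -= lr * grad
    return theta

theta = base_train(theta_init=0.0)  # converges close to 1.0
```

Everything here (the loss shape, the learning rate, the initialisation) is fixed by hand; meta-learning, described next, is about optimising some of those choices themselves.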

Now you can meta-learn the best meta-parameters to achieve this meta-goal. Examples of meta-parameters include the network initialisation, the shape of the loss function, the network architecture, etc.
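The two-level structure can be sketched as follows. Assumptions (all hypothetical, not from the original post): each "task" is a 1-D quadratic with a different target, the "policy" is a single scalar parameter, and the meta-parameter being learned is the initialisation, in the spirit of MAML (Finn et al., 2017).

```python
def task_loss(theta, target):
    # Base loss: distance of the adapted parameter from this task's optimum.
    return (theta - target) ** 2

def inner_adapt(theta0, target, lr=0.1, steps=5):
    """Base (inner) optimisation: adapt to one task starting from theta0."""
    theta = theta0
    for _ in range(steps):
        theta -= lr * 2.0 * (theta - target)  # gradient of task_loss
    return theta

def meta_train(targets, meta_lr=0.05, meta_steps=500):
    """Meta (outer) optimisation: learn an initialisation theta0 that
    minimises the post-adaptation loss averaged over training tasks."""
    def meta_loss(t0):
        return sum(task_loss(inner_adapt(t0, t), t) for t in targets) / len(targets)

    theta0 = 0.0
    eps = 1e-4
    for _ in range(meta_steps):
        # Central finite-difference meta-gradient keeps the sketch
        # dependency-free; a real implementation would backpropagate
        # through the inner loop instead.
        g = (meta_loss(theta0 + eps) - meta_loss(theta0 - eps)) / (2 * eps)
        theta0 -= meta_lr * g
    return theta0

theta0 = meta_train([1.0, 2.0, 3.0])  # learned initialisation, near 2.0
```

The inner loop is the ordinary optimisation from the SAC example; the outer loop treats the initialisation as a meta-parameter and optimises it so that adaptation to any of the training tasks is fast, which is exactly the "second layer of optimisation" described above.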

Check out "Meta-Learning in Neural Networks: A Survey" (https://arxiv.org/abs/2004.05439).

Rexcirus