Questions tagged [lunar-lander]

For questions related to the Lunar Lander environment used in reinforcement learning.

1 question
3 votes, 1 answer

Why is training longer not better in reinforcement learning?

I have trained an RL agent (PPO) for 6 million steps to solve the OpenAI Gym LunarLander-v2 environment. Surprisingly, the agent already performs best after 320K steps and gets worse after that. In the TensorBoard log, I can see that the mean, min reward…