Hi! I have just trained my first model with stable-baselines3, on a game I wrote with pygame in Python. The goal of the game is for a ball to reach the highest of three platforms placed in the sky.

After a few days of trying, I managed to get the model to learn how to get there. However, after reaching the third platform, the ball falls off the map. I wanted to train a new model to fix this, but to my surprise, increasing total_timesteps to 500_000 made things much worse: the ball just jumps in place, whereas the model trained for far fewer timesteps (150_000) reached the highest platform!

Why is that?

Shouldn't more timesteps make the policy converge further towards staying on the highest platform instead of falling off?

Here's how I create and train the model:

model = PPO("MlpPolicy", env, verbose=1, learning_rate=0.0002, ent_coef=0.2)

model.learn(total_timesteps=500_000)
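One way to see where in the 500_000-step run the behaviour changes would be to save checkpoints during a single run and evaluate each of them, instead of comparing two separate runs. The following is only a minimal sketch under some assumptions: `env` is the pygame environment from above (not shown here), a single non-vectorized environment is used, and the names `checkpoints`, `ppo_ball`, `train_with_checkpoints` and `evaluate_checkpoints` are hypothetical. It relies on `CheckpointCallback` and `evaluate_policy` from stable-baselines3.

```python
import os


def checkpoint_path(save_dir, prefix, steps):
    """File that CheckpointCallback writes after `steps` timesteps (single env)."""
    return os.path.join(save_dir, f"{prefix}_{steps}_steps.zip")


def train_with_checkpoints(env, total_timesteps=500_000, save_freq=50_000,
                           save_dir="checkpoints", prefix="ppo_ball"):
    """Train PPO with the same settings as above, saving a copy every `save_freq` steps."""
    # Imported here so the path helper above can be used without SB3 installed.
    from stable_baselines3 import PPO
    from stable_baselines3.common.callbacks import CheckpointCallback

    model = PPO("MlpPolicy", env, verbose=1, learning_rate=0.0002, ent_coef=0.2)
    callback = CheckpointCallback(save_freq=save_freq, save_path=save_dir,
                                  name_prefix=prefix)
    model.learn(total_timesteps=total_timesteps, callback=callback)
    return model


def evaluate_checkpoints(env, total_timesteps=500_000, save_freq=50_000,
                         save_dir="checkpoints", prefix="ppo_ball",
                         n_eval_episodes=10):
    """Return {steps: (mean_reward, std_reward)} for every saved checkpoint."""
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy

    results = {}
    for steps in range(save_freq, total_timesteps + 1, save_freq):
        model = PPO.load(checkpoint_path(save_dir, prefix, steps))
        # deterministic=True evaluates the policy without action sampling noise.
        results[steps] = evaluate_policy(model, env,
                                         n_eval_episodes=n_eval_episodes,
                                         deterministic=True)
    return results
```

Calling `train_with_checkpoints(env)` followed by `evaluate_checkpoints(env)` would then give the mean episode reward at 50_000, 100_000, ... 500_000 steps, which should show whether performance peaks somewhere around 150_000 and degrades afterwards or never gets there in the longer run.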

Skorejen
