13

I am training a bot for a game I am creating. It is a 2D game with a complex map made of various shapes. Both the bot and the player character shoot bullets that can ricochet. The neural network outputs a direction vector; the bot turns to face that direction and then fires. I cannot calculate the correct trajectory myself, so I cannot compute a loss for the network. However, I can rate how well the network performed each time it fires. Can I train it by simply giving it a rating, and if so, how?

Trang Oul
Beluker

2 Answers

21

Yes, this is a textbook reinforcement learning problem. Instead of trying to model the dynamics of the environment, you give the agent a reward or punishment for its behavior in the environment (hitting an enemy, etc.). The training process then searches for a policy that maximizes the reward signal. It's a rather deep field and a bit much for a single post. Try reading one or two papers on Deep Q-Network (DQN) or Proximal Policy Optimization (PPO) and see whether either can be formulated to fit your problem.
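To make the idea of learning from a rating alone concrete, here is a minimal sketch. It is not DQN or PPO; it uses the simpler REINFORCE policy-gradient update on a toy problem with a single firing angle and no game state. The function `rate_shot`, the hidden best angle of 1.0 rad, and all hyperparameters are assumptions made up for illustration. The learner only ever sees the scalar rating, never how it was computed.

```python
import random

def rate_shot(angle):
    """Hypothetical black-box rating (higher is better).
    Stands in for 'how good was that shot?'; the learner never
    looks inside this function."""
    hidden_best = 1.0  # assumed for the toy problem only
    return -(angle - hidden_best) ** 2

def train(steps=5000, lr=0.01, sigma=0.3, seed=0):
    """REINFORCE with a Gaussian policy over one firing angle."""
    rng = random.Random(seed)
    mu = 0.0        # policy parameter: mean firing angle
    baseline = 0.0  # running average rating, reduces variance
    for _ in range(steps):
        angle = rng.gauss(mu, sigma)   # sample an action (explore)
        rating = rate_shot(angle)      # scalar feedback only
        advantage = rating - baseline
        # d/d(mu) of log N(angle; mu, sigma) is (angle - mu) / sigma^2
        mu += lr * advantage * (angle - mu) / sigma ** 2
        baseline += 0.05 * (rating - baseline)
    return mu

if __name__ == "__main__":
    print("learned mean angle:", train())
```

In a real game the single parameter `mu` would be replaced by a neural network that maps the game state to the mean of the firing direction, but the update has the same shape: make highly rated actions more likely and poorly rated ones less likely.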

foreverska
14

Yes, but we're talking about a lot of ratings: millions of them. You have to automate rating generation, because human feedback would get far too expensive.
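One way to automate the rating, sketched under heavy assumptions: simulate the shot and score it by how close the bullet ever gets to the target. The example below assumes an empty rectangular room with perfectly reflecting walls, which is far simpler than the map in the question; `auto_rating` and all of its parameters are hypothetical names. A real game would step its own physics engine instead.

```python
import math

def auto_rating(angle, shooter=(1.0, 1.0), target=(8.0, 6.0),
                width=10.0, height=8.0, step=0.05, max_steps=1000):
    """Rate a shot as minus the closest distance between the bullet
    and the target (0.0 would be a perfect hit). No human needed."""
    x, y = shooter
    dx, dy = math.cos(angle) * step, math.sin(angle) * step
    best = math.hypot(x - target[0], y - target[1])
    for _ in range(max_steps):
        x += dx
        y += dy
        # Ricochet: mirror the position back inside and flip velocity.
        if x < 0.0:
            x, dx = -x, -dx
        elif x > width:
            x, dx = 2.0 * width - x, -dx
        if y < 0.0:
            y, dy = -y, -dy
        elif y > height:
            y, dy = 2.0 * height - y, -dy
        best = min(best, math.hypot(x - target[0], y - target[1]))
    return -best

if __name__ == "__main__":
    direct = math.atan2(6.0 - 1.0, 8.0 - 1.0)   # aim straight at target
    bank = math.atan2(10.0 - 1.0, 8.0 - 1.0)    # aim at target mirrored in top wall
    print(auto_rating(direct), auto_rating(bank))
```

Because the function runs in well under a millisecond per shot, it can be called millions of times during training, which is the scale a human rater cannot reach.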