I am currently trying to train a bot for a game I am creating. It is a 2D game with a complex map made of various shapes. The bot and the player character shoot bullets that can ricochet. The neural network outputs a direction vector; the bot turns to face that direction and then fires. I cannot calculate the correct trajectory myself, so I cannot compute a loss for the network. However, I can give a rating of how well the neural network performed when it fired. Can I train it simply by giving it a rating, and if so, how?
2 Answers
21
Yes, this is a textbook reinforcement learning problem. Instead of trying to calculate the dynamics of the environment, the agent is given a reward or punishment for its behavior in the environment (shooting an enemy, etc.). The training process tries to find a policy that maximizes the reward signal. It’s a rather deep field and a bit much for a single post. Try reading one or two papers on Deep Q-Networks (DQN) or Proximal Policy Optimization (PPO) and see whether either can be formulated to work for your problem.
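To make the "train from a rating alone" idea concrete, here is a minimal sketch of a policy-gradient (REINFORCE) update on a one-shot version of the problem. It is neither DQN nor PPO, just the simplest member of the same family. Everything in it is an assumption for illustration: the policy is a single learnable firing angle with Gaussian exploration noise rather than a neural network, and `rate_shot`, `target_angle`, and `train` are hypothetical names, with `rate_shot` standing in for whatever rating your game produces.

```python
import math
import random


def rate_shot(angle, target_angle=1.0):
    """Hypothetical stand-in for the game's rating: higher is better.

    The learner never sees target_angle, only the returned number.
    """
    return -abs(angle - target_angle)


def train(episodes=5000, lr=0.01, sigma=0.3, seed=0):
    """Learn a firing angle from scalar ratings only (REINFORCE)."""
    rng = random.Random(seed)
    mean = 0.0      # the only learnable parameter: mean firing angle
    baseline = 0.0  # running average of ratings, used to reduce variance
    for _ in range(episodes):
        # Explore: sample an angle around the current mean and fire.
        angle = rng.gauss(mean, sigma)
        reward = rate_shot(angle)
        # Gradient of log N(angle; mean, sigma) with respect to mean.
        grad_log_prob = (angle - mean) / sigma ** 2
        # Shots rated better than average pull the mean toward them,
        # shots rated worse than average push it away.
        mean += lr * (reward - baseline) * grad_log_prob
        baseline += 0.05 * (reward - baseline)
    return mean


if __name__ == "__main__":
    print(f"learned angle: {train():.2f}")
```

No loss against a "correct" trajectory appears anywhere: the update uses only the sampled action and its rating. In a real bot the single `mean` would be replaced by the output of the network, and the same gradient would be backpropagated through it.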
foreverska
14
Yes, but we're talking about a lot of ratings, on the order of millions. You have to automate the rating generation; human feedback would be far too expensive.
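One way to automate the rating is to let the game itself score each shot, for example by simulating the bullet and measuring how close it came to the target. The sketch below assumes a plain rectangular arena with perfectly reflecting walls, which is much simpler than the map described in the question; `simulate_min_distance`, `auto_rating`, and all the default positions and sizes are hypothetical.

```python
import math


def simulate_min_distance(angle, start=(1.0, 1.0), target=(8.0, 5.0),
                          width=10.0, height=6.0, speed=0.05, steps=2000):
    """Step a ricocheting bullet and return its closest approach to target."""
    x, y = start
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    best = math.hypot(x - target[0], y - target[1])
    for _ in range(steps):
        x += vx
        y += vy
        # Ricochet: mirror the position across the wall and flip velocity.
        if x < 0.0:
            x, vx = -x, -vx
        elif x > width:
            x, vx = 2.0 * width - x, -vx
        if y < 0.0:
            y, vy = -y, -vy
        elif y > height:
            y, vy = 2.0 * height - y, -vy
        best = min(best, math.hypot(x - target[0], y - target[1]))
    return best


def auto_rating(angle):
    """Rating computed with no human in the loop: closer passes score higher."""
    return -simulate_min_distance(angle)


if __name__ == "__main__":
    direct = math.atan2(5.0 - 1.0, 8.0 - 1.0)  # aim straight at the target
    print(auto_rating(direct), auto_rating(math.pi))
```

Because the rating is just a function call, it can be evaluated millions of times during training at no human cost. In the actual game, the engine's own collision code would take the place of the rectangle logic here.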
Emilio M Bumachar