I have a custom environment for stock trading where an episode can be as long as 2000-3000 steps. I've run several experiments with the TD3 and SAC algorithms, and the average reward per episode flattens after a few episodes. I believe the average reward per episode should improve further, so I wonder whether my training episodes are too long. What is the recommended upper limit on episode length?