Questions tagged [multi-objective-rl]
For questions about multi-objective reinforcement learning (MORL).
7 questions
13
votes
3 answers
Why is the reward in reinforcement learning always a scalar?
I'm reading Reinforcement Learning by Sutton & Barto, and in section 3.2 they state that the reward in a Markov decision process is always a scalar real number. At the same time, I've heard about the problem of assigning credit to an action for a…
user40138
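A useful way to see what "scalar" rules out: even when several objectives exist, they are usually collapsed into one real number before the agent sees them. A minimal sketch of that reduction, with purely illustrative reward components and weights (not from the question):

```python
import numpy as np

# Illustrative multi-objective reward collapsed into the single scalar
# that the standard MDP formulation in Sutton & Barto expects.
reward_vector = np.array([1.0, -0.5, 0.2])  # e.g. progress, energy cost, smoothness
weights = np.array([0.6, 0.3, 0.1])         # assumed preference weights

scalar_reward = float(weights @ reward_vector)  # linear scalarization: 0.47
```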
4
votes
2 answers
Can rewards be decomposed into components?
I'm training a robot to walk to a specific $(x, y)$ point using TD3 and, for simplicity, I have something like `reward = distance_x + distance_y + standing_up_straight`, which is then added to the replay buffer. However, I think that it…
pinkie pAI
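A sketch of the setup described in the question, assuming a generic replay buffer; whether to store the summed scalar (as here) or the individual components for later re-weighting is exactly what the asker is probing:

```python
from collections import deque

replay_buffer = deque(maxlen=100_000)  # stand-in for TD3's replay buffer

def store_transition(state, action, next_state, done,
                     distance_x, distance_y, standing_up_straight):
    # Components are summed into one scalar, as in the question's reward.
    reward = distance_x + distance_y + standing_up_straight
    replay_buffer.append((state, action, reward, next_state, done))
```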
2
votes
1 answer
Why does multi-objective RL exist?
I have recently posted a question here about a problem I have controlling a robotic arm.
Basically, I have a dense reward for the arm's position and a sparse reward for the arm's stiffness: Reward shaping for dense and sparse rewards.
I am using…
mavex857
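For context, a hypothetical single-objective version of the asker's reward; the hand-tuned weights below are assumptions, and needing to pick them at all is one motivation for MORL:

```python
def combined_reward(position_error, stiffness_target_met,
                    w_dense=1.0, w_sparse=10.0):
    """Hypothetical mix of a dense position term and a sparse stiffness bonus."""
    dense = -w_dense * position_error                   # shaped on every step
    sparse = w_sparse if stiffness_target_met else 0.0  # paid only on success
    return dense + sparse
```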
2
votes
1 answer
Can the rewards be matrices when using DQN?
I have a basic question. I'm working towards developing a reward function for my DQN. I'd like to train an RL agent to edit pixels on an image. I understand that convolutions are ideal for working with images, but I'd like to observe the agent doing…
junfanbl
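DQN's TD target $r + \gamma \max_{a'} Q(s', a')$ assumes $r$ is a scalar, so a matrix of per-pixel rewards would have to be reduced before the update. A minimal sketch, assuming a mean reduction and hypothetical values throughout:

```python
import numpy as np

def scalarize_pixel_reward(reward_matrix: np.ndarray) -> float:
    # Assumed reduction: average the per-pixel rewards into one scalar.
    return float(reward_matrix.mean())

gamma = 0.99
reward_matrix = np.random.rand(84, 84)  # hypothetical per-pixel rewards
q_next = np.array([0.1, 0.4, 0.3])      # hypothetical Q(s', a') values
td_target = scalarize_pixel_reward(reward_matrix) + gamma * q_next.max()
```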
2
votes
1 answer
What are preferences and preference functions in multi-objective reinforcement learning?
In RL (reinforcement learning) or MARL (multi-agent reinforcement learning), we have the usual tuple:
(state, action, transition_probabilities, reward, next_state)
In MORL (multi-objective reinforcement learning), we have two more additions to the…
Huan
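One common concrete form of a preference function in MORL is a convex weight vector over the reward components; a minimal sketch, with the weights as placeholders rather than anything from the question:

```python
import numpy as np

def utility(reward_vector: np.ndarray, w: np.ndarray) -> float:
    # Linear preference function u(r) = w . r with convex weights.
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)
    return float(w @ reward_vector)

u = utility(np.array([2.0, -1.0]), np.array([0.7, 0.3]))  # 1.1
```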
2
votes
1 answer
What are some simple open problems in multi-agent RL that would be suited for a bachelor's thesis?
I've decided to write my bachelor's thesis on RL. I am currently struggling to find a good problem; I am interested in multi-agent RL and the dilemma between selfishness and cooperation.
I only have 2 months to complete it, and I'm afraid that…
Rom
0
votes
0 answers
Optimizing a nonlinear objective function in Deep Reinforcement Learning
I'm working on a reinforcement learning problem where the environment returns a reward pair $(r_{t+1}^{(a)}, r_{t+1}^{(b)})$. The goal is to maximize the following nonlinear objective function.
$$
E[\lim_{T \to \infty } \frac{ \sum_{r=t}^{t+T-1}…
Alex
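The excerpt truncates the objective, so the sketch below only shows the bookkeeping such a long-run criterion would need: empirical averages of both reward components over a horizon $T$. How they are combined, the nonlinearity itself, is the part the excerpt cuts off:

```python
def average_reward_pair(rewards):
    """rewards: a list of (r_a, r_b) pairs collected over T steps."""
    T = len(rewards)
    avg_a = sum(r_a for r_a, _ in rewards) / T
    avg_b = sum(r_b for _, r_b in rewards) / T
    # A nonlinear objective would apply some f(avg_a, avg_b) here;
    # the question's f is truncated in the excerpt.
    return avg_a, avg_b
```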