I recently posted a question here about a problem I have controlling a robotic arm. Basically, I have a dense reward for the arm's position and a sparse reward for the arm's stiffness: Reward shaping for dense and sparse rewards
I am currently using PPO in my attempts to solve it, but with little success. I have now learned about specialised Multi-Objective RL algorithms and think they might be better suited.
Still, I am wondering why that would be, or why we would need a separate algorithm for MOO at all. Can we not always combine the rewards for the individual objectives with some formula to form a single scalar reward? Weighted sum, ratio, difference, whatever?
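For concreteness, this is roughly what I mean by folding everything into one scalar. A minimal sketch, assuming a Gymnasium-style environment that reports the two reward components in `info` under the hypothetical keys `"position_reward"` and `"stiffness_reward"` (the keys and weights are just placeholders, not anything from my actual setup):

```python
import gymnasium as gym


class ScalarizedRewardWrapper(gym.Wrapper):
    """Collapse per-objective rewards into one scalar via a weighted sum."""

    def __init__(self, env, w_position=1.0, w_stiffness=0.1):
        super().__init__(env)
        self.w_position = w_position
        self.w_stiffness = w_stiffness

    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        # Weighted-sum scalarization: one fixed trade-off between the
        # objectives is baked in before training even starts.
        reward = (self.w_position * info["position_reward"]
                  + self.w_stiffness * info["stiffness_reward"])
        return obs, reward, terminated, truncated, info
```

With something like this, any single-objective algorithm such as PPO sees just one reward, so I don't understand what a dedicated MORL algorithm buys me over picking the weights (or some other combining formula) myself.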