I'm currently reading chapter 2.2 of Sakurai, where the Heisenberg picture is detailed. However, Sakurai doesn't treat the most general case of a time-dependent Hamiltonian $\mathcal{H}(t)$ with $[\mathcal{H}(t_1), \mathcal{H}(t_2)] \neq 0$. Of course, this also implies that $[\mathcal{H}(t), U(t_0;t_2)] \neq 0$, where $U(t_0;t)$ is the time evolution operator from $t_0$ to $t$.
I tried to derive the Heisenberg EOM myself for the most general case. Here is my progress:
Consider an operator $A(t)$, denoted $A^{(H)}(t)$ in the Heisenberg picture and $A^{(S)}(t)$ in the Schrödinger picture. I am assuming here that this operator has explicit time dependence. Of course, these operators are related by
$$A^{(H)}(t) = U^{(S)\dagger}(t_0;t) A^{(S)}(t) U^{(S)}(t_0;t)$$
where $U^{(S)}(t_0;t)$ is the unitary time evolution operator from $t_0$ to $t$ in the Schrödinger picture, satisfying $$i \hbar \frac{dU^{(S)}(t_0;t)}{dt} = \mathcal{H}^{(S)}(t)U^{(S)}(t_0;t) \\ -i \hbar \frac{dU^{(S)\dagger}(t_0;t)}{dt} = U^{(S)\dagger}(t_0;t)\mathcal{H}^{(S)}(t).$$ In particular, notice that the Hamiltonian $\mathcal{H}^{(H)}(t) \neq \mathcal{H}^{(S)}(t)$, as $\mathcal{H}^{(S)}$ doesn't necessarily commute with $U(t_0;t')$ for different times. Furthermore, $U^{(H)}(t_0;t) \neq U^{(S)}(t_0;t)$, as the time evolution operator is also not guaranteed to commute with itself at different times in general (right?).
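To make the claim $\mathcal{H}^{(H)}(t) \neq \mathcal{H}^{(S)}(t)$ concrete, here is a minimal numerical sketch. The two-level Hamiltonian `H_S`, the choice $\hbar = 1$, and the helper names are my own toy assumptions, not anything from Sakurai. It approximates $U^{(S)}(t_0;t)$ as a time-ordered product of short-time propagators and prints the norms of $[\mathcal{H}(t_1), \mathcal{H}(t_2)]$, of $U^\dagger U - \mathbb{1}$, and of $\mathcal{H}^{(H)}(t) - \mathcal{H}^{(S)}(t)$.

```python
import numpy as np

HBAR = 1.0  # assumption: units with hbar = 1

# Pauli matrices used to build a toy two-level system
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def H_S(t):
    """Toy Schrodinger-picture Hamiltonian (hypothetical) with [H(t1), H(t2)] != 0."""
    return SZ + np.sin(t) * SX


def short_time_step(H, dt):
    """exp(-i H dt / hbar) for a Hermitian H, via its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return v @ np.diag(np.exp(-1j * w * dt / HBAR)) @ v.conj().T


def U_S(t, t0=0.0, steps=2000):
    """Approximate U(t0; t) as a time-ordered product; later times multiply from the left."""
    dt = (t - t0) / steps
    U = np.eye(2, dtype=complex)
    for k in range(steps):
        # Hamiltonian evaluated at the midpoint of each small time slice
        U = short_time_step(H_S(t0 + (k + 0.5) * dt), dt) @ U
    return U


t = 1.3
U = U_S(t)
H_H = U.conj().T @ H_S(t) @ U  # Heisenberg-picture Hamiltonian at time t
comm_H = H_S(0.3) @ H_S(t) - H_S(t) @ H_S(0.3)

print("||[H(0.3), H(1.3)]|| =", np.linalg.norm(comm_H))
print("||U^dag U - 1||      =", np.linalg.norm(U.conj().T @ U - np.eye(2)))
print("||H^(H) - H^(S)||    =", np.linalg.norm(H_H - H_S(t)))
```

For this toy model the first and third norms should come out clearly nonzero while the second stays at rounding level, which is at least consistent with the statement about the Hamiltonian above.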
Now, we simply take the total time derivative of $A^{(H)}(t)$ to figure out the time dependence
$$ \frac{d A^{(H)}}{dt} = \frac{d}{dt} \left( U^{(S)\dagger}(t_0;t) A^{(S)}(t) U^{(S)}(t_0;t) \right)$$
$$=\frac{d U^{(S) \dagger}(t_0; t)}{dt}A^{(S)}(t)U^{(S)}(t_0;t) + U^{(S) \dagger}(t_0;t) A^{(S)}(t) \frac{d U^{(S)}(t_0;t)}{dt} + U^{(S) \dagger}(t_0;t) \frac{d A^{(S)}(t)}{dt} U^{(S)}(t_0;t).$$
Now, since the Schrödinger-picture operator depends on time only explicitly, we have $\frac{d A^{(S)}}{dt} = \frac{\partial A^{(S)}}{\partial t}$. Plugging in the time derivatives of the time evolution operator from the differential equations above, we find
$$=\frac{1}{i \hbar} \left[U^{(S)\dagger}(t_0;t)A^{(S)}(t) \mathcal{H}^{(S)}(t)U^{(S)}(t_0;t)-U^{(S) \dagger}(t_0;t) \mathcal{H}^{(S)}(t)A^{(S)}(t)U^{(S)}(t_0;t) \right] + U^{(S) \dagger}(t_0;t)\frac{\partial A^{(S)}(t)}{\partial t} U^{(S)}(t_0;t).$$
Inserting the identity $\mathbb{1} = U^{(S)}(t_0; t)U^{(S)\dagger}(t_0;t)$ between $A^{(S)}(t)$ and $\mathcal{H}^{(S)}(t)$ in the first and second terms we find
$$ =\frac{1}{i \hbar} \left[U^{(S)\dagger}(t_0;t)A^{(S)}(t) \left(U^{(S)}(t_0; t)U^{(S)\dagger}(t_0;t)\right) \mathcal{H}^{(S)}(t)U^{(S)}(t_0;t) - U^{(S) \dagger}(t_0;t) \mathcal{H}^{(S)}(t)\left(U^{(S)}(t_0; t)U^{(S)\dagger}(t_0;t)\right)A^{(S)}(t)U^{(S)}(t_0;t) \right] + U^{(S) \dagger}(t_0;t)\frac{\partial A^{(S)}(t)}{\partial t} U^{(S)}(t_0;t) $$
$$ =\frac{1}{i \hbar} \left[\left(U^{(S)\dagger}(t_0;t)A^{(S)}(t) U^{(S)}(t_0; t)\right) \left(U^{(S)\dagger}(t_0;t) \mathcal{H}^{(S)}(t)U^{(S)}(t_0;t) \right) - \left(U^{(S) \dagger}(t_0;t) \mathcal{H}^{(S)}(t) U^{(S)}(t_0; t)\right) \left(U^{(S)\dagger}(t_0;t)A^{(S)}(t)U^{(S)}(t_0;t) \right) \right] + U^{(S) \dagger}(t_0;t)\frac{\partial A^{(S)}(t)}{\partial t} U^{(S)}(t_0;t). $$ Now, we simply switch to the Heisenberg representation
$$= \frac{1}{i \hbar} \left[A^{(H)}(t), \mathcal{H}^{(H)}(t) \right] + \left[ \frac{\partial A^{(S)}(t)}{\partial t}\right]^{(H)}.$$
Written a bit more cleanly, the final result is then
$$ \frac{d A^{(H)}(t)}{dt} = \frac{1}{i \hbar} \left[A^{(H)}(t), \mathcal{H}^{(H)}(t) \right] + \left[ \frac{\partial A^{(S)}(t)}{\partial t}\right]^{(H)}. $$
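As a sanity check on this result, here is a minimal numerical sketch. The toy operators `H_S` and `A_S`, the choice $\hbar = 1$, and all helper names are my own assumptions for illustration. It integrates $i\hbar\, dU/dt = \mathcal{H}^{(S)}(t)U$ with a fourth-order Runge-Kutta scheme, takes a central finite difference of $A^{(H)}(t)$, and compares it with the right-hand side, where the last term is computed as $U^\dagger \,\partial_t A^{(S)}\, U$.

```python
import numpy as np

HBAR = 1.0  # assumption: units with hbar = 1

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def H_S(t):
    """Toy Hamiltonian (hypothetical) that does not commute with itself at different times."""
    return SZ + np.sin(t) * SX


def A_S(t):
    """Toy Schrodinger-picture observable (hypothetical) with explicit time dependence."""
    return np.cos(t) * SX


def dA_S_dt(t):
    """Explicit (partial) time derivative of A_S."""
    return -np.sin(t) * SX


def U_S(t, t0=0.0, steps=2000):
    """Integrate i*hbar dU/dt = H_S(t) U with U(t0; t0) = 1 using classical RK4."""
    def rhs(s, V):
        return H_S(s) @ V / (1j * HBAR)

    V = np.eye(2, dtype=complex)
    dt = (t - t0) / steps
    s = t0
    for _ in range(steps):
        k1 = rhs(s, V)
        k2 = rhs(s + dt / 2, V + dt / 2 * k1)
        k3 = rhs(s + dt / 2, V + dt / 2 * k2)
        k4 = rhs(s + dt, V + dt * k3)
        V = V + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        s += dt
    return V


def to_heisenberg(op_S, t):
    """Map a Schrodinger-picture matrix at time t to U^dagger op U."""
    U = U_S(t)
    return U.conj().T @ op_S @ U


def eom_residual(t, h=1e-4):
    """Norm of d/dt A^(H) minus (1/(i hbar)) [A^(H), H^(H)] minus (dA^(S)/dt)^(H)."""
    lhs = (to_heisenberg(A_S(t + h), t + h)
           - to_heisenberg(A_S(t - h), t - h)) / (2 * h)
    A_H = to_heisenberg(A_S(t), t)
    H_H = to_heisenberg(H_S(t), t)
    rhs = (A_H @ H_H - H_H @ A_H) / (1j * HBAR) + to_heisenberg(dA_S_dt(t), t)
    return np.linalg.norm(lhs - rhs)


print("EOM residual at t = 1.3:", eom_residual(1.3))
```

The residual should be limited only by the finite-difference and integration errors, so a small value here supports the form of the equation with $\left[\partial A^{(S)}/\partial t\right]^{(H)}$ as the last term, at least for this toy model.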
My confusion is with the last term, $\left[ \frac{\partial A^{(S)}(t)}{\partial t}\right]^{(H)}$. In the Stack Exchange posts I looked at before, this term is usually just written as $\frac{\partial A}{\partial t}$, or sometimes even as $\frac{\partial A^{(H)}(t)}{\partial t}$, as in this post.
Are all of these definitions equivalent even in the most general case? I would imagine that $U(t_0; t)$ wouldn't necessarily commute with the time derivative operator in general (would it?), in which case $$\frac{\partial A^{(H)}(t)}{\partial t} = \frac{\partial}{\partial t} \left(U^{(S)\dagger}(t_0;t) A^{(S)}(t) U^{(S)}(t_0;t) \right) \neq U^{(S)\dagger}(t_0;t) \frac{ \partial A^{(S)}(t)}{\partial t} U^{(S)}(t_0;t) = \left( \frac{\partial A^{(S)}(t)}{\partial t}\right)^{(H)}. $$
What am I missing here?