Your quote is an exaggeration: $H_1$ may be diagonalized, which then fixes the ambiguous parts of the basis of the degenerate $H_0$. "Already" is out of place in an ambiguous basis.
If $[H_0,H_1]=0$, then $H_0$, $H_1$, and $H=H_0+\lambda H_1$ can be simultaneously diagonalized. The zeroth-order basis is ambiguous in the degenerate subspace, but once you have diagonalized $H_1$ there, which breaks the degeneracy, you have already diagonalized the full Hamiltonian: $\langle n_0^i | H_1|n_0^j \rangle \propto \delta_{ij}$.
You have thus found the exact eigenstates of the full Hamiltonian, to all orders in $\lambda$, not merely to first order: the first-order shifted energy is an exact expression,
$$
E^j= E^j_0 + \lambda \langle n_0^j | H_1|n_0^j \rangle.
$$
No perturbative approximation is needed, as you have already diagonalized the full Hamiltonian.
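
Spelled out as a one-line check, under the assumption that $|n_0^j\rangle$ is a common eigenvector with $H_0|n_0^j\rangle = E_0^j|n_0^j\rangle$ and $H_1|n_0^j\rangle = h^j|n_0^j\rangle$ (the shorthand $h^j \equiv \langle n_0^j|H_1|n_0^j\rangle$ is introduced here for illustration only):

$$
H|n_0^j\rangle = (H_0+\lambda H_1)|n_0^j\rangle = \left(E_0^j + \lambda h^j\right)|n_0^j\rangle ,
$$

so the eigenvalue is linear in $\lambda$ and there is nothing left for higher orders to correct.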
The point is that $H_1$, assuming it is not degenerate itself, specifies the orthogonal eigenstates by breaking the degeneracy, i.e., it rotates the arbitrary zeroth-order eigenstates into eigenstates of $H_1$ as well.
To illustrate, take $H_0=\operatorname{diag}(1,1,2)$ and
$$
H_1=\begin{bmatrix} 3/2 &-1/2&0\\
-1/2&3/2&0\\
0&0&3\end{bmatrix},
$$
which is subsequently diagonalized to $H_1=\operatorname{diag}(1,2,3)$ without affecting $H_0$. It is $H_1$ that fixes the above basis dubbed $|n_0^j\rangle$, not $H_0$.
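
The example can be checked numerically. The following is a minimal sketch using NumPy; the coupling value `lam = 0.3` and all variable names are arbitrary choices made here for illustration, not part of the argument above.

```python
import numpy as np

# Matrices from the example in the text
H0 = np.diag([1.0, 1.0, 2.0])
H1 = np.array([[ 1.5, -0.5, 0.0],
               [-0.5,  1.5, 0.0],
               [ 0.0,  0.0, 3.0]])
lam = 0.3  # arbitrary illustrative coupling (assumption)

# The two pieces commute, so a common eigenbasis exists
commute = np.allclose(H0 @ H1, H1 @ H0)

# Diagonalizing H1 fixes the basis inside the degenerate subspace of H0
h1_vals, U = np.linalg.eigh(H1)   # eigenvalues returned in ascending order
H0_rot = U.T @ H0 @ U             # H0 expressed in the eigenbasis of H1
H1_rot = U.T @ H1 @ U             # diagonal by construction

# "First-order" energies E_0^j + lam * <n_0^j|H1|n_0^j> in that basis
first_order = np.diag(H0_rot) + lam * np.diag(H1_rot)

# Exact spectrum of the full Hamiltonian, for comparison
exact = np.linalg.eigvalsh(H0 + lam * H1)

print(commute)
print(np.round(h1_vals, 12))
print(np.allclose(np.sort(first_order), exact))
```

In this sketch `H0_rot` comes out unchanged, since the rotation acts only inside the degenerate block where $H_0$ is proportional to the identity, and the sorted first-order energies coincide with the exact eigenvalues for any value of `lam` one cares to try.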
(Freak lark logical possibility: you need both pieces to break each other's degeneracy in a common diagonal outcome, e.g., for $H_0=\operatorname{diag}(1,1,2)$, $H_1=\operatorname{diag}(1,2,2)$. One should never get one's students there... They should spend their time on useful things...)