In the information-theoretic treatment of, say, the canonical ensemble, one maximises the Shannon entropy
$$ S =-\sum_i p_i \ln p_i$$
where $p_i$ is the subjective probability for occupying state $\lvert i \rangle$, subject to the constraint that the mean energy is a constant $U$, which for Hamiltonian $H$ can be written as
$$ U = \sum_i p_i \langle i \vert H \vert i \rangle.$$
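As a sketch of that maximisation (the multipliers $\alpha$, $\beta$ and the shorthand $h_i = \langle i \vert H \vert i \rangle$ are my own notation, not part of the standard statement), one extremises
$$ S - \alpha \left( \sum_i p_i - 1 \right) - \beta \left( \sum_i p_i h_i - U \right) $$
with respect to each $p_i$, which gives $-\ln p_i - 1 - \alpha - \beta h_i = 0$ and hence
$$ p_i = \frac{e^{-\beta h_i}}{Z}, \qquad Z = \sum_i e^{-\beta h_i},$$
where $\alpha$ has been eliminated by normalisation and $\beta$ is fixed by the constraint on $U$. Note that nothing in this step requires the $\lvert i \rangle$ to be eigenstates of $H$; only the diagonal elements $h_i$ enter.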
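However, it is always assumed that the states $\lvert i \rangle$ are energy eigenstates with energies $E_i$, so that $\langle i \vert H \vert i \rangle = E_i$, and this assumption is essential to obtaining the Gibbs distribution. If we worked in some other basis, the same procedure would give weights $p_i \propto e^{-\beta \langle i \vert H \vert i \rangle}$ (with $\beta$ the Lagrange multiplier for the energy constraint) on states that are not stationary, and the resulting mixture would in general not be the Gibbs state.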
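To illustrate the basis dependence, here is a minimal numerical sketch for a hypothetical two-level system. Everything in it is an assumption made for illustration: the Hamiltonian is taken to be $H = \mathrm{diag}(0, \Delta)$ in its eigenbasis $\lvert 0 \rangle, \lvert 1 \rangle$, the alternative basis is $\lvert a \rangle = \cos\theta \lvert 0 \rangle + \sin\theta \lvert 1 \rangle$, $\lvert b \rangle = -\sin\theta \lvert 0 \rangle + \cos\theta \lvert 1 \rangle$, and the function names are my own. The maximum-entropy weights are taken to be $p_i \propto e^{-\beta \langle i \vert H \vert i \rangle}$, and the code evaluates the off-diagonal element of $\rho = \sum_i p_i \lvert i \rangle \langle i \rvert$ in the energy basis, which vanishes for a Gibbs state.

```python
import math


def diagonal_energies(theta, gap=1.0):
    """Diagonal elements <a|H|a>, <b|H|b> of H = diag(0, gap) in the basis
    rotated by angle theta away from the energy eigenbasis (assumed model)."""
    s2 = math.sin(theta) ** 2
    return (gap * s2, gap * (1.0 - s2))


def maxent_probs(h, beta):
    """Maximum-entropy weights p_i = exp(-beta * h_i) / Z for given diagonal
    elements h_i and Lagrange multiplier beta."""
    weights = [math.exp(-beta * x) for x in h]
    z = sum(weights)
    return [w / z for w in weights]


def energy_basis_coherence(theta, beta, gap=1.0):
    """Off-diagonal element <0|rho|1> of rho = p_a |a><a| + p_b |b><b|,
    expressed in the energy eigenbasis. It is zero for a Gibbs state."""
    p_a, p_b = maxent_probs(diagonal_energies(theta, gap), beta)
    # <0|a><a|1> = cos(theta) sin(theta); <0|b><b|1> = -sin(theta) cos(theta)
    return math.sin(theta) * math.cos(theta) * (p_a - p_b)


if __name__ == "__main__":
    beta = 1.0
    for theta in (0.0, 0.3, math.pi / 4):
        h = diagonal_energies(theta)
        print(theta, h, maxent_probs(h, beta), energy_basis_coherence(theta, beta))
```

At $\theta = 0$ the rotated basis coincides with the eigenbasis and the coherence vanishes, as it should. For a generic angle the coherence is nonzero, so the state is not of Gibbs form. At $\theta = \pi/4$ both diagonal elements equal $\Delta/2$, so the constraint can only be satisfied for $U = \Delta/2$ and then does not restrict the $p_i$ at all.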
In other words, there seems to be an implicit assumption that we are measuring the energy of the system and requiring that the outcomes average to $U$, as opposed to measuring a more general observable with eigenstates $\lvert i \rangle$ and requiring that the quantum expectation values $\langle i \vert H \vert i \rangle$, averaged over the $p_i$, give $U$.
I am wondering if this is correct, and if so, what is the interpretation of this energy measurement in terms of thermal contact with a heat bath? Is the heat bath somehow 'measuring' the energy of the system, and why energy specifically?