Background
I am not entirely sure that statistical mechanics is necessary, as the resulting effect is a macroscopic one, thus applicable to thermodynamics. However, I will touch on the statistical mechanics aspects a bit.
Much of my answer derives from notes from the thermal expansion Wikipedia page (https://en.wikipedia.org/wiki/Thermal_expansion) and the book by Zel'dovich and Raizer [2002].
Definitions
Let us first start with some definitions of parameters, in no particular order.
- Constants
- Variables
- $T$ = the scalar temperature
- $V$ = the scalar volume
- $\epsilon$ = the specific internal energy
- $P$ = the scalar pressure
- Subscripts
- $c$ = for components relating to elastic effects (i.e., exclusively due to forces of interaction between atoms/molecules)
- $T$ = due to thermal motion of atoms/molecules
- $e$ = due to thermal excitation of electrons
- $o$ = initial or standard condition/value
- Coefficients/Parameters
- $\kappa_{o} = \tfrac{ 1 }{ V_{o} } \left( \tfrac{ \partial V }{ \partial P } \right)_{T}$ = the isothermal compressibility
- $\alpha = \tfrac{ 1 }{ V_{o} } \left( \tfrac{ \partial V }{ \partial T } \right)_{P}$ = the isobaric thermal expansion coefficient
- $C_{v}$ = specific heat at constant volume = $3 \ k_{B} \ N$, where $N$ is the number of atoms/molecules
- $\Gamma \left( V \right)$ = Grüneisen coefficient
- $C_{so}$ = isothermal speed of sound = $\left( \tfrac{ V_{o} }{ \kappa_{o} } \right)^{1/2} = \left( - V^{2} \tfrac{ d P_{c} }{ d V } \right)^{1/2}$, where $N$ is the number of atoms/molecules
- Thermodynamic Identities
- $\left( \tfrac{ \partial \epsilon }{ \partial V } \right)_{T} = T \left( \tfrac{ \partial P }{ \partial T } \right)_{V} - P$
- $-1 = \left( \tfrac{ \partial P }{ \partial T } \right)_{V} \left( \tfrac{ \partial T }{ \partial V } \right)_{P} \left( \tfrac{ \partial V }{ \partial P } \right)_{T}$
Approach
We can separate $P$ and $\epsilon$ in three parts: elastic, thermal motion/vibration, and thermal electron excitation. The last part is generally quite small unless the material is at very high temperatures (e.g., $T \gtrsim 10^{4} \ K$). Then we can define:
$$
\begin{align}
\epsilon & = \epsilon_{c}\left( V \right) + 3 \ N \ k_{B} \ T \tag{1a} \\
P & = P_{c}\left( V \right) + P_{T}\left( V, T \right) \tag{1b}
\end{align}
$$
I discussed the physical interpretation of the elastic terms in another answer at https://physics.stackexchange.com/a/216180/59023:
Let us define $P$ as the pressure and $\epsilon$ as the internal energy of a solid material. These can be divided into two parts: an elastic (subscript $c$) and thermal part. $P_{c}$ and $\epsilon_{c}$ depend only upon the density of the material, $\rho$, or the specific volume, $V$ = $1/\rho$. These are equal to the total pressure and specific internal energy at absolute zero or $T = 0 \ K$. Let us assume that the specific volume at $T = 0$ and $P = 0$ is given by $V_{oc}$, which is only ~1-2% smaller than the specific volume at STP, $V_{o}$, for most metals.
The potential energy curve, or curve defining $\epsilon_{c}$, is qualitatively similar to the potential energy curve describing the interaction between two atoms as a function of the intranuclear distance, $\Delta x_{n}$. When $V > V_{oc}$, the attractive forces dominate but fall off rapidly as the intranuclear distances increase (e.g., as $T$ increases). In other words, when the atoms move further apart $\epsilon_{c}$ will asymptotically increase to some value $U$, which is roughly the binding energy of the atoms in the body. Thus, $U$ represents the energy required to remove all atoms from the object to infinity, which is roughly equal to the heat of vaporization for the material ... Thus, $\epsilon_{c} (V) \rightarrow U$ as $\Delta x_{n} \rightarrow \sim 2$.
Conversely, the repulsive forces dominate if $V < V_{oc}$. We can define this quantitatively by considering that the work done by compressing the material will be equal to the increase in internal energy. In other words:
$$
P_{c} = - \left( \frac{ d \epsilon_{c} }{ d V } \right)_{T = 0}
$$
which is equivalent to saying it is the isothermic/isentropic equation for cold compression. The negative sign shows that if a tensile force were applied to the body, the binding forces between atoms would act as a restoring force.
If we use the first thermodynamic identity above we find two results, one for the elastic parameters and one for the thermal:
$$
\begin{align}
\left( \frac{ \partial \epsilon_{c} }{ \partial V } \right)_{T} & = \frac{ d \epsilon_{c} }{ d V } = - P_{c}\left( V \right) \tag{2a} \\
\left( \frac{ \partial \epsilon_{T} }{ \partial V } \right)_{T} & = 0 = T \left( \frac{ \partial P_{T} }{ \partial T } \right)_{V} - P_{T} \tag{2b}
\end{align}
$$
In the linear limit, Equation 2b suggests we can express $P_{T}$ as a function dependent upon volume times the temperature. This allows use to write $P_{T}$ as:
$$
\begin{align}
P_{T} & = T \left( \frac{ \partial P_{T} }{ \partial T } \right)_{V} = \phi\left( V \right) \ T \tag{3a} \\
& = \Gamma \left( V \right) \frac{ C_{v} \ T }{ V } \tag{3b} \\
& = \Gamma \left( V \right) \frac{ \epsilon_{T} }{ V } \tag{3c}
\end{align}
$$
Note that Equation 3 shows that $\Gamma \left( V \right)$ is a parameter that characterizes the ratio of $P_{T}$ to $\epsilon_{T}$, called the Grüneisen coefficient/parameter.
If we use the second thermodynamic identity above and Equation 3a combined with the definitions for $\kappa_{o}$ and $\alpha$, we can show:
$$
\begin{align}
-1 & = \left( \frac{ \partial P }{ \partial T } \right)_{V} \left( \frac{ \partial T }{ \partial V } \right)_{P} \left( \frac{ \partial V }{ \partial P } \right)_{T} \tag{4a} \\
& = - \kappa_{o} \ V_{o} \ \left( \frac{ \partial P }{ \partial T } \right)_{V} \left( \frac{ \partial T }{ \partial V } \right)_{P} \tag{4b} \\
& = - \frac{ \kappa_{o} \ V_{o} }{ \alpha \ V_{o} } \left( \frac{ \partial P }{ \partial T } \right)_{V} \tag{4c} \\
& = - \frac{ \kappa_{o} \ P_{T} }{ \alpha \ T } \tag{4c} \\
& = - \frac{ \kappa_{o} \ C_{v} }{ \alpha \ V_{o} } \ \Gamma \left( V_{o} \right) \tag{4d}
\end{align}
$$
In the limit of low temperature, we can see from Equation 4d that:
$$
\Gamma \left( V_{o} \right) \equiv \Gamma_{o} = \frac{ \alpha \ V_{o} }{ \kappa_{o} \ C_{v} } \tag{5}
$$
which shows that $\Gamma_{o}$ is independent of temperature. This is only true for cases when $P_{e}$ and $\epsilon_{e}$ are small. We can also relate $\Gamma_{o}$ to the isothermal sound speed by:
$$
\Gamma_{o} = \frac{ \alpha \ C_{so}^{2} }{ C_{v} } \tag{6}
$$
In the limit of high temperature (i.e., when $k_{B} \ T \gg h \ \nu$ or when thermal pressures/energies dominate over vibrational [for lattice configurations]), the specific free energy is given by:
$$
F = \epsilon_{c}\left( V \right) + 3 \ N \ k_{B} \ T \ \ln \frac{ h \ \bar{\nu} }{ k_{B} \ T} \tag{7}
$$
where $\bar{\nu} = e^{-1/3} \ k_{B} \ \theta$ is the average vibrational frequency and $\theta$ is the Debye temperature. We can relate $F$ to $P$ and $\epsilon$ through the following:
$$
\begin{align}
\epsilon & = F - T \frac{ \partial F }{ \partial T } \tag{8a} \\
& = \epsilon_{c}\left( V \right) + 3 \ N \ k_{B} \ T \tag{8b} \\
& = \epsilon_{c} + \epsilon_{T} \tag{8c} \\
P & = - \frac{ \partial F }{ \partial V } \tag{8d} \\
& = - \frac{ \partial \epsilon_{c} }{ \partial V } - 3 \ N \ k_{B} \ T \frac{ \partial \ln h \ \bar{\nu} }{ \partial V } \tag{8e} \\
& = P_{c} + \epsilon_{T} \frac{ \partial \ln h \ \bar{\nu} }{ \partial V } \tag{8f} \\
& = P_{c} + P_{T} \tag{8g}
\end{align}
$$
We can use Equation 8 and 3c to show that:
$$
\begin{align}
P_{T} = \epsilon_{T} \frac{ \partial \ln h \ \bar{\nu} }{ \partial V } & = \Gamma \left( V \right) \frac{ \epsilon_{T} }{ V } \tag{9a} \\
& = \frac{ V \ P_{T} }{ \Gamma \left( V \right) } \frac{ \partial \ln h \ \bar{\nu} }{ \partial V } \tag{9b} \\
& = \frac{ P_{T} }{ \Gamma \left( V \right) } \frac{ \partial V }{ \partial \ln V } \frac{ \partial \ln h \ \bar{\nu} }{ \partial V } \tag{9c} \\
& = \frac{ P_{T} }{ \Gamma \left( V \right) } \frac{ \partial \ln h \ \bar{\nu} }{ \partial \ln V } \tag{9d}
\end{align}
$$
which gives us the expression for $\Gamma \left( V \right)$ in terms of $\bar{\nu}$, given by:
$$
\Gamma \left( V \right) = \frac{ \partial \ln h \ \bar{\nu} }{ \partial \ln V } \tag{10}
$$
We can approximate $\bar{\nu}$ as ~$C_{so}/r_{o}$, where $r_{o}$ is the interatomic spacing, but this means that
$$
\bar{\nu} \sim V^{2/3} \left( - \frac{ d P_{c} }{ d V } \right) \tag{11}
$$
Answer
We can now see why solids expand when heated, if we examine Equations 1b and 11. If one heats a solid, then $P_{T}$ increases. Thus, upon heating the elastic pressure, $P_{c}$, becomes negative in order to maintain pressure balance, but the body of the material expands [quote on page 700 of Zel'dovich and Raizer, 2002]:
...up to that point when the binding forces holding the atoms in the lattice, or the negative pressure, will no longer counterbalance the repulsive effect of the positive thermal pressure...
References
- Zel'dovich, Ya.B., and Yu.P. Raizer (2002) Physics of Shock Waves and High-Temperature Hydrodynamic Phenomena, Ed. by W.D. Hayes and R.F. Probstein, Mineola, NY, Dover Publications, inc., The Dover Edition; ISBN-13: 978-0486420028.