Typically, we can calculate the thermodynamic characteristic function of a system using different ensembles. How does one prove in general that, in the thermodynamic limit, they all converge to the same result? For a finite system, I believe the results from different ensembles should differ.
2 Answers
You are correct: for finite systems, the ensembles give different results. The thermodynamic limit is the crucial assumption in proving the equivalence of the ensembles. At a technical level, the standard proof is to assume a large deviation principle and apply Laplace's method.
A good example is worth a thousand words. Consider $N$ independent, classical two-state systems, say Ising spins with energies $\pm1$. A microstate is defined by $\sigma\in\{-1,+1\}^N$ and the total energy of the system is: $$ U[\sigma]=\sum_{i=1}^N\sigma_i $$ In the microcanonical ensemble, you fix the energy $U$ and compute $W(U)=e^S$, the number of corresponding microstates. By combinatorics, you can check that: $$ W=\binom N{\frac{N+U}2}=\frac{N!}{\left(\frac{N+U}2\right)!\left(\frac{N-U}2\right)!} $$ In the canonical ensemble, on the other hand, you fix the temperature. With inverse temperature $\beta$, you can compute the partition function $Z(\beta)=e^{-\beta F}$: $$ Z=(2\cosh\beta)^N $$ For finite $N$, the two ensembles do not agree. The microcanonical ensemble is only properly defined for a finite set of possible energies: $$E\in\{-N,-N+2,...,N\}$$ Therefore, strictly speaking, it does not even have a well defined temperature: $$ \beta=\frac{dS}{dU} $$ A quick fix is to extend the formula beyond its original domain of definition using the gamma function. You can then compute the temperature: $$ \beta=\frac{\psi\left(\frac{N-U}2+1\right)-\psi\left(\frac{N+U}2+1\right)}2 $$
From the canonical ensemble perspective, you get a different relation: $$ \begin{align} U &= -N\tanh\beta \\ S &= N\left[\ln(2\cosh\beta)-\beta\tanh\beta\right] \\ &= -N\left[\left(\frac{N+U}{2N}\right)\ln\left(\frac{N+U}{2N}\right)+\left(\frac{N-U}{2N}\right)\ln\left(\frac{N-U}{2N}\right)\right] \end{align} $$ As you can see, even after extending the microcanonical ensemble, the two do not agree at finite $N$. However, as $N\to\infty$, the microcanonical result tends to the canonical one by Stirling's formula.
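For illustration (a numerical sketch, not part of the original argument, with $k_B=1$ and an arbitrary energy density $u=U/N$): the exact microcanonical entropy per spin, evaluated through the gamma-function extension, converges to the canonical expression as $N$ grows.

```python
from math import lgamma, log

def s_micro(N, U):
    """ln W for W = binom(N, (N+U)/2), via the gamma-function extension."""
    return lgamma(N + 1) - lgamma((N + U) / 2 + 1) - lgamma((N - U) / 2 + 1)

def s_canon(u):
    """Canonical entropy per spin at energy density u = U/N."""
    pp, pm = (1 + u) / 2, (1 - u) / 2
    return -(pp * log(pp) + pm * log(pm))

u = 0.5  # assumed energy density, for illustration only
for N in (10, 100, 1000, 10000):
    # the two columns approach each other as N grows
    print(N, s_micro(N, u * N) / N, s_canon(u))
```

The finite-$N$ discrepancy is of order $(\ln N)/N$, exactly the subleading term in Stirling's formula.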
Another example would be $N$ non-relativistic independent particles in a box of volume $V$ in $D$ dimensions. The Hamiltonian is: $$ H = \sum\frac12p_i^2 $$ In the microcanonical ensemble (focusing only on the energy dependence): $$ W = \frac{V^N}{N!}\frac{2\pi^{ND/2}}{\Gamma(ND/2)}(2U)^{(ND-1)/2} \\ S = \frac{ND-1}2\ln U+... \\ \beta = \frac{ND-1}{2U} $$ while in the canonical ensemble: $$ Z = \frac{V^N}{N!}(2\pi T)^{ND/2} \\ \ln Z = \frac{ND}2\ln T + ... \\ U = \frac{ND}2T $$ In other words, equipartition is exact in the canonical ensemble, but is exact in the microcanonical ensemble only in the thermodynamic limit.
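Numerically (a sketch with assumed values of $D$ and $T$, units $k_B=1$), the two ensembles assign energies at the same temperature that differ by a relative $1/(ND)$:

```python
# Microcanonical: beta = (N*D - 1)/(2U)  =>  U = (N*D - 1)/2 * T
# Canonical (exact equipartition):           U = N*D/2 * T
D, T = 3, 2.0  # assumed dimension and temperature, for illustration
for N in (1, 10, 100, 10000):
    u_canon = N * D / 2 * T
    u_micro = (N * D - 1) / 2 * T
    # last column is the relative discrepancy, exactly 1/(N*D)
    print(N, u_canon, u_micro, 1 - u_micro / u_canon)
```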
More generally, you can prove the equivalence by assuming extensivity. In the microcanonical ensemble, assume that as $N\to\infty$ you have the convergence: $$ \frac{S(N,uN)}{N}\xrightarrow{N\to\infty}s(u) $$ with $s,u$ the intensive values. Then, when computing the canonical partition function: $$ \begin{align} Z &= \int e^{S-\beta E}dE \\ &\asymp \int e^{N(s(u)-\beta u)}du \\ &\asymp e^{-N\beta f} \\ \beta f(\beta) &= \inf_u(\beta u-s(u)) \end{align} $$ one first substitutes the limiting function and then applies Laplace's method. Applying this general method to the previous example recovers the classic proof of Stirling's formula. The variational definition of the Legendre transform emerges naturally from applying Laplace's method to the Laplace transform. In particular, $\beta f$ is the Legendre transform of $s$. From thermodynamics, it is more familiar to view $f$ as the Legendre transform of $u$, but in statistical mechanics it is the former that is more natural.
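The variational step can be checked numerically for the spin example (a sketch; the grid resolution is an arbitrary choice): minimizing $\beta u - s(u)$ over energy densities reproduces the exact canonical answer $\beta f = -\ln(2\cosh\beta)$.

```python
from math import log, cosh

def s(u):
    """Microcanonical entropy per spin (binary entropy) of the Ising example."""
    pp, pm = (1 + u) / 2, (1 - u) / 2
    return -(pp * log(pp) + pm * log(pm))

beta = 0.7  # assumed inverse temperature
# grid over the open interval (-1, 1); endpoints are excluded since s diverges there
us = [i / 10000 for i in range(-9999, 10000)]
beta_f = min(beta * u - s(u) for u in us)   # Legendre transform, numerically
exact = -log(2 * cosh(beta))                # exact canonical beta*f per spin
print(beta_f, exact)  # agree to grid resolution
```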
Once you have the Legendre transform, the equivalence of the ensembles follows automatically. By the variational definition: $$ \beta=\frac{ds}{du} $$ in accordance with the microcanonical definition. This applies to all thermodynamic potentials: a fluctuating number of particles gives the grand canonical ensemble, a fluctuating volume gives the Gibbs free energy, and a fluctuating magnetic moment gives the magnetic analogue of the Gibbs free energy.
Mathematically, the key step is extensivity. In the context of probability theory, this is known as large deviation theory. In the thermodynamic limit, the probability measure concentrates on a typical macroscopic state, so the detailed behavior of the distribution does not matter. More generally, the variational principles of thermodynamics follow from Varadhan's theorem. You can find an overview of the topic in "The large deviation approach to statistical mechanics" by Touchette, and more details in "Entropy, Large Deviations and Statistical Mechanics" by Ellis.
The general scheme
In the microcanonical approach, we look for the number of microstates $\Omega$ associated with fixed values of the thermodynamic variables $(U,V,N)$, and we use the Boltzmann relation, which establishes
$$ S(U,V,N) = k_b \ln ( \Omega (U,V,N)). $$
In thermodynamic terms, we are treating the system as a closed one in this case, with well-defined and fixed energy, volume and number of particles. In terms of ensembles, we choose the uniform distribution over the system's microstates
$$ p_j = \frac{1}{\Omega} $$
So, by information theory, the entropy is exactly the one given by the Boltzmann relation
$$ S= - k_b \sum_j p_j \ln p_j = k_b\ln\Omega $$
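In code (a tiny check, taking $k_b=1$ and an arbitrary number of microstates), this reduction is immediate:

```python
from math import log

# Uniform distribution over Omega microstates: Shannon entropy equals ln(Omega).
Omega = 120  # assumed microstate count, for illustration
p = [1 / Omega] * Omega
S_shannon = -sum(pj * log(pj) for pj in p)
print(S_shannon, log(Omega))  # identical up to rounding
```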
In the canonical approach, the procedure is different. The system is no longer thought of as closed, but as being in contact with a thermal reservoir. Thermodynamically, the variables are now $(T,V,N)$. These are the natural variables of the Helmholtz free energy $F(T,V,N)$, which is the Legendre transform of the potential $U(S,V,N)$ with respect to $S$.
In terms of ensembles, the probabilities of the states are
$$ f_j = \frac{e^{-E_j/k_b T}}{Z} $$
where $Z= \sum_j e^{-E_j/k_b T}$ is known as the partition function, and we have the relation
$$ F(T,V,N) = -k_bT \ln ( Z). $$
To show that this agrees with the previous result, we just need to do another Legendre transform of this potential, with respect to $T$. We have
$$ S = -\left (\frac{\partial F}{\partial T}\right)_{V,N} = k_b \ln Z + \frac UT $$
This is the same expression we would obtain starting from the Shannon entropy $S= -k_b \sum_j f_j\ln f_j$.
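As a quick sanity check (a sketch with a made-up four-level spectrum, $k_b=1$), the thermodynamic entropy $-\partial F/\partial T$, evaluated by finite differences, matches the Shannon entropy of the canonical weights:

```python
from math import exp, log

E = [0.0, 1.0, 1.0, 2.5]  # hypothetical energy levels, for illustration

def F(T):
    """Helmholtz free energy F = -T ln Z (k_b = 1)."""
    return -T * log(sum(exp(-e / T) for e in E))

T, h = 1.3, 1e-5
S_thermo = -(F(T + h) - F(T - h)) / (2 * h)   # S = -dF/dT, central difference

Z = sum(exp(-e / T) for e in E)
f = [exp(-e / T) / Z for e in E]              # canonical weights
S_shannon = -sum(fj * log(fj) for fj in f)
print(S_thermo, S_shannon)  # agree to finite-difference accuracy
```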
Inverting the relation obtained above, we get $T=T(S,V,N)$. We use this expression in $U=F+TS$ to get $U(S,V,N)$. Since $U$ is a strictly monotonic function of $S$, we can invert it to get $S(U,V,N)$. You can do this for many specific examples and check that it works, but unfortunately there is no way to follow this path in the general case (using generic expressions like the ones here). But there are (at least) two other ways to prove the equivalence of the ensembles, or at least the direction microcanonical $\to$ canonical.
Microcanonical $\to$ canonical ensemble
You can find this procedure in [1]. I modify the argument slightly, in a way that fits better here.
We assume a system plus a reservoir, which together we call "TOT". TOT has energy $E_{TOT}$ and is a closed system, so we can describe it using the microcanonical ensemble
$$ p_{TOT} = \frac{1}{\Omega_{TOT}(E_{TOT})}. $$
When the system has energy $E_j$, the reservoir has $E_{TOT} - E_j$. The reservoir is an auxiliary system, so it can be taken to be extremely big, such that it is practically unaffected by the system. Under this hypothesis, we can assume that the reservoir is also described by the microcanonical ensemble
$$ p_{res} = \frac{1}{\Omega_{res}(E_{TOT}-E_j)} $$
By the multiplicative property of probabilities, if $f_j$ is the probability of finding the system in the state with energy $E_j$,
$$ p_{TOT} = f_j \, p_{res} \Rightarrow f_j = {\Omega_{res}(E_{TOT}-E_j)\over \Omega_{TOT}(E_{TOT})} $$
Using the Boltzmann relation, we can write $\Omega = e^{S/k_b}$ in both parts of the fraction above. In the denominator, we use additivity to write $S_{TOT}(E_{TOT}) = S(U) + S_{res}(E_{TOT}-U)$. In the numerator, we Taylor expand: $S_{res}(E_{TOT}-E_j) = S_{res}(E_{TOT} - U + U - E_j) = S_{res}(E_{TOT}-U) + (U-E_j)\frac 1T$, where all higher-order terms vanish because the reservoir's temperature $T$ is constant. It follows that
$$ f_j = e^{\beta (U-ST)} e^{-\beta E_j}, \quad \beta = \frac{1}{k_bT}. $$
Identifying $F= U-TS$, we recover the canonical ensemble.
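A minimal numerical illustration of this argument (an assumed toy setup, not from [1]): take the reservoir to be $M$ two-level units, so $\Omega_{res}(E)=\binom ME$, coupled to a two-level system with energies $0$ and $1$. The exact probability ratio $f_1/f_0 = \Omega_{res}(E_{TOT}-1)/\Omega_{res}(E_{TOT})$ approaches the Boltzmann factor as the reservoir grows:

```python
from math import comb, exp, log

for M in (10, 100, 10000):
    E_tot = M // 4                      # reservoir energy at fixed density 1/4
    # exact ratio of microstate counts: binom(M, k-1)/binom(M, k) = k/(M-k+1)
    ratio = comb(M, E_tot - 1) / comb(M, E_tot)
    # reservoir inverse temperature beta = dS_res/dE ~ ln((M - E)/E)
    beta = log((M - E_tot) / E_tot)
    print(M, ratio, exp(-beta))  # columns converge as M grows
```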
Meaning of the partition function
If we analyse the partition function more carefully, we can find its connection with the microcanonical ensemble. Note that the expression
$$ Z(\beta) = \sum_j e^{-\beta E_j} $$
is quite generic and does not contain much information about what the state $j$ is. If we instead take $j$ to label a macrostate with degeneracy $\Omega(E_j)$, we can write
$$ Z(\beta) = \sum_j \Omega(E_j)e^{-\beta E_j}. $$
Usually we consider a continuous range of possible energies, so in fact the partition function reads
$$ Z(\beta) = \int dE \, \Omega(E) e^{-\beta E}. $$
But this is the definition of the Laplace transform of the density of states $\Omega(E)$, which in turn is nothing more than the number of microstates compatible with the macroscopic energy constraint $E$. The partition function is then a kind of characteristic function (or rather a moment generating function) of the density $\Omega$. This means it can equally well be used to calculate moments such as $\langle E \rangle$, giving the same result as a calculation using the density of states directly.
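A small sketch of this equality (for the independent-spin example with an assumed $N=8$): the partition function computed as a sum over all $2^N$ microstates equals the degeneracy-weighted sum over energy levels, and both give the same $\langle E \rangle$:

```python
from itertools import product
from math import comb, exp, tanh

N, beta = 8, 0.5  # assumed size and inverse temperature

# Sum over microstates sigma in {-1, +1}^N, with energy E = sum(sigma).
Z_micro = sum(exp(-beta * sum(s)) for s in product((-1, 1), repeat=N))

# Sum over energy levels E = -N, -N+2, ..., N, weighted by the degeneracy
# Omega(E) = binom(N, (N + E)/2).
levels = range(-N, N + 1, 2)
Z_levels = sum(comb(N, (N + E) // 2) * exp(-beta * E) for E in levels)

# First moment from the density of states; equals -d ln Z / d beta = -N tanh(beta).
E_mean = sum(E * comb(N, (N + E) // 2) * exp(-beta * E) for E in levels) / Z_levels
print(Z_micro, Z_levels, E_mean, -N * tanh(beta))
```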
I don't know how to work these concepts out for finite systems. I recommend E. T. Jaynes's papers [2,3] connecting thermodynamics with information theory, which show how each ensemble can be recovered from the MaxEnt principle.
[1] H. Callen. Thermodynamics and an Introduction to Thermostatistics. Section 16, p. 350.
[2] Jaynes, E. T. (1957). Information theory and statistical mechanics. Physical review, 106(4), 620.
[3] Jaynes, E. T. (1957). Information theory and statistical mechanics. II. Physical review, 108(2), 171.