Usually, we maximize functions analytically by setting their derivatives equal to zero. So we might try to maximize the fairness via $$\frac{\partial\mathcal{F}}{\partial P_i} = 0$$ $$= -k_B (\ln P_i + 1)$$ Using the formula for the fairness function, what can this tell us about $P_i$? It doesn't make much sense at all… it means $P_i = e^{-1}$.$\ddot\frown$ The problem is that the $P_i$ can't take just any values: they are probabilities, and they have to add up to one. This is a constraint, and to solve a constrained maximization we use the method of Lagrange multipliers. We first define a Lagrangian (The term Lagrangian means different things in different fields. In this case, we aren't using the usual physics meaning of Lagrangian, but rather the definition from optimization theory, since we are optimizing.): $$\mathcal{L} = \mathcal{F} + \alpha k_B\left(1-\sum_i P_i\right)$$ Note that since the added term should be zero, we haven't changed the thing we want to maximize. Now we maximize this in the same way, but some extra terms show up in our derivatives. We could, by the way, obtain our constraint by maximizing over $\alpha$ (the Lagrange multiplier) as well as over the probabilities $P_i$, since $\frac{\partial\mathcal{L}}{\partial\alpha} = k_B\left(1-\sum_i P_i\right) = 0$ is just the constraint itself.
When we maximize $\mathcal{L}$, we find $$\frac{\partial\mathcal{L} }{\partial P_i} = -k_B (\ln P_i + 1) - \alpha k_B$$ $$= 0$$ $$\ln P_i + 1 = -\alpha$$ $$P_i = e^{-1-\alpha}$$ This tells us that all the states are equally probable. To find the actual probabilities, we also need to apply the constraint: $$0 = 1-\sum_i P_i$$ $$= 1 - \sum_i e^{-1-\alpha}$$ $$= 1 - e^{-1-\alpha} \sum_i 1$$ $$= 1 - N e^{-1-\alpha}$$ $$e^{-1-\alpha} = \frac1N$$ $$P_i = \frac1N$$ So the probability of each state is one over the total number of states. This makes sense: if all $N$ states are equally probable, each one must have probability $\frac1N$.
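This result is easy to check numerically. Here is a minimal sketch in Python (using NumPy; the number of states and the random sampling are illustrative assumptions, not part of the lesson) that compares the fairness of the uniform distribution with that of many randomly chosen normalized distributions:

```python
import numpy as np

k_B = 1.0   # work in units where Boltzmann's constant is 1
N = 5       # number of states (an arbitrary choice for illustration)

def fairness(P):
    """The fairness F = -k_B * sum_i P_i ln P_i that we are maximizing."""
    return -k_B * np.sum(P * np.log(P))

uniform = np.full(N, 1.0 / N)
print("uniform fairness:", fairness(uniform))   # equals k_B ln N

# Sample many random normalized distributions; none should beat the uniform one.
rng = np.random.default_rng(0)
best_random = max(fairness(rng.dirichlet(np.ones(N))) for _ in range(10000))
print("best random fairness:", best_random)     # strictly less than k_B ln N
```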
However, this doesn't really make much physical sense, since it means that states with a huge energy are just as likely as states with a very small energy. The reason is that we haven't yet taken the energy into account. We did, however, succeed in demonstrating that the maximum fairness occurs when all states are equally probable, as promised.
If the energy of a system is actually constrained (as it generally is), then we should apply a second constraint, besides the one that allows us to normalize our probabilities. $$\mathcal{L} = -k_B\sum_iP_i\ln P_i + \alpha k_B\left(1-\sum_i P_i\right) + \beta k_B \left(U - \sum_i P_i E_i\right)$$ where $\alpha$ and $\beta$ are the two Lagrange multipliers. We want to maximize this, so we set its derivatives to zero: $$\frac{\partial\mathcal{L}}{\partial P_i} = 0 $$ $$= -k_B\left(\ln P_i + 1\right) - k_B\alpha - \beta k_B E_i$$ $$\ln P_i = -1 -\alpha - \beta E_i$$ $$P_i = e^{-1-\alpha-\beta E_i}$$ At this point, it is convenient to invoke the normalization constraint… $$\sum_i P_i = 1$$ $$1= \sum_i e^{-1-\alpha-\beta E_i}$$ $$1= e^{-1-\alpha}\sum_i e^{-\beta E_i}$$ $$e^{1+\alpha} = \sum_i e^{-\beta E_i}$$ Defining the partition function $$Z \equiv \sum_i^\text{all states} e^{-\beta E_i}$$ we find $$P_i = \frac{e^{-\beta E_i}}{Z}$$ $$P_i = \frac{\text{Boltzmann factor}}{\text{partition function}}$$ At this point, we haven't yet solved for $\beta$, and to do so, we'd need to invoke the internal energy constraint: $$U = \sum_i E_i P_i$$ $$U = \frac{\sum_i E_i e^{-\beta E_i}}{Z}\label{eq:UfromZ}$$
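As a concrete illustration of these formulas, the sketch below computes $Z$, the probabilities $P_i$, and the internal energy $U$ for a small made-up set of energy eigenvalues at a chosen value of $\beta$; the specific numbers are assumptions for illustration only:

```python
import numpy as np

beta = 1.0                              # a chosen value of the Lagrange multiplier beta
E = np.array([0.0, 1.0, 1.0, 2.0])      # made-up energy eigenvalues E_i

boltzmann = np.exp(-beta * E)           # Boltzmann factors e^{-beta E_i}
Z = boltzmann.sum()                     # partition function Z
P = boltzmann / Z                       # probabilities P_i = e^{-beta E_i} / Z
U = np.sum(E * P)                       # internal energy U = sum_i E_i P_i

print("Z =", Z)
print("P =", P, " (sums to", P.sum(), ")")
print("U =", U)
```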
The partition function is a particularly useful quantity. Physically, it is nothing more than the normalization factor needed to compute probabilities, but in practice finding that normalization is typically the hardest part of a calculation (once you have found all the energy eigenvalues, that is).
One interesting question is whether the partition function is intensive or extensive. To examine that question, we will look at the partition function of two combined, uncorrelated systems, similar to what we examined earlier. $$Z_A = \sum_i e^{-\beta E_i^A}$$ $$Z_B = \sum_j e^{-\beta E_j^B}$$ $$Z_{AB} = \sum_{ij} e^{-\beta \left(E_i^A + E_j^B\right)}$$ $$= \sum_{ij} e^{-\beta E_i^A}e^{-\beta E_j^B}$$ $$= \sum_{i} \sum_j e^{-\beta E_i^A}e^{-\beta E_j^B}$$ $$= \sum_{i} e^{-\beta E_i^A} \sum_j e^{-\beta E_j^B}$$ $$= \left(\sum_i e^{-\beta E_i^A}\right) \left(\sum_j e^{-\beta E_j^B}\right)$$ $$= Z_A Z_B$$
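The same factorization can be verified numerically. Here is a small sketch (with made-up energy levels and an arbitrary $\beta$, assumptions for illustration) that builds the combined system's states as all pairs of states of $A$ and $B$ and compares $Z_{AB}$ with $Z_A Z_B$:

```python
import numpy as np

beta = 1.0                                  # arbitrary choice of beta
E_A = np.array([0.0, 0.7, 1.3])             # made-up energies of system A
E_B = np.array([0.2, 0.9])                  # made-up energies of system B

Z_A = np.exp(-beta * E_A).sum()
Z_B = np.exp(-beta * E_B).sum()

# Each state of the combined system is a pair (i, j) with energy E_i^A + E_j^B.
E_AB = E_A[:, None] + E_B[None, :]          # table of all pairwise energy sums
Z_AB = np.exp(-beta * E_AB).sum()

print(Z_AB, Z_A * Z_B)                      # the two agree: Z_AB = Z_A * Z_B
```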
So the partition functions of two uncorrelated systems multiply rather than add. This means that the log of the partition function is itself extensive! It will turn out to be a thermodynamic state function that you have already encountered, as we will see tomorrow.
One consequence of this logarithmic extensivity is that if you have $N$ identical non-interacting systems with uncorrelated probabilities, you can write their combined partition function as $$Z = Z_1^N$$ where $Z_1$ is the partition function of a single non-interacting system.
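In code, this just means raising the single-system partition function to the $N$th power; the sketch below (with made-up energy levels, an assumption for illustration) also shows that $\ln Z$ then scales linearly with $N$:

```python
import numpy as np

beta = 1.0
E_1 = np.array([0.0, 1.0])         # made-up energy levels of one system
Z_1 = np.exp(-beta * E_1).sum()    # single-system partition function

N = 3                              # three identical, uncorrelated copies
Z = Z_1 ** N                       # combined partition function Z = Z_1^N

# ln Z = N ln Z_1, so the log of the partition function grows with system size.
print(np.log(Z), N * np.log(Z_1))
```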