Electrostatic Energy
How much work is done in assembling a collection of $n$ point charges? Work is force times distance. To move a charge $q$ slowly through an electric field $\EE$, we must exert a force $\FF=-q\EE$ that balances the electric force, so the work we do takes the form \begin{eqnarray*} W = \int \FF\cdot d\rr = -q \int \EE\cdot d\rr = q \,\Delta V \end{eqnarray*} where $\Delta V$ is the change in potential along the path. Moving the first charge into place requires no work, since there is not yet any electric field. The second charge must be moved in the (Coulomb) field of the first, the third in the field of the first two, and so on. Continuing in this manner, we see that the work done in assembling the charges is \begin{eqnarray*} W &=& \frac{1}{4\pi\epsilon_0} \sum_{i=1}^n\sum_{j=i+1}^n \frac{q_i q_j}{|\rr_i - \rr_j|}\\ &=& \frac{1}{8\pi\epsilon_0} \sum_{i=1}^n\sum_{j=1,\,j\ne i}^n \frac{q_i q_j}{|\rr_i - \rr_j|} \end{eqnarray*} The advantage of the second expression, in which each pair is counted twice (hence the extra factor of $\frac12$), is that it can be rewritten in the form \begin{eqnarray*} W = \frac12 \sum_{i=1}^n q_i V(\rr_i) \end{eqnarray*} where \begin{eqnarray*} V(\rr_i) = \frac{1}{4\pi\epsilon_0} \sum_{j=1,\,j\ne i}^n \frac{q_j}{|\rr_i - \rr_j|} \end{eqnarray*} is the potential at the location $\rr_i$ of the $i$th charge due to all the other charges. This expression in turn generalizes naturally to a continuous charge distribution (but see the discussion in Griffiths about some subtleties in this limit), for which it becomes \begin{eqnarray*} W = \frac12 \int \rho V \,d\tau \end{eqnarray*}
*Our derivation here follows Griffiths, Sec. 2.5, pp. 96-106.*
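The equivalence of the two discrete expressions for $W$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names (`pair_energy`, `potential_at`, `half_sum_energy`) and the three sample charges and positions are illustrative assumptions. It computes the sum over distinct pairs and the half-sum $\frac12\sum q_i V(\rr_i)$ and compares them.

```python
import math

EPS0 = 8.8541878128e-12            # vacuum permittivity in F/m
K = 1.0 / (4.0 * math.pi * EPS0)   # Coulomb constant 1/(4 pi eps0)

def pair_energy(charges, positions):
    """Assembly energy as a sum over distinct pairs i < j."""
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(i + 1, n):
            total += charges[i] * charges[j] / math.dist(positions[i], positions[j])
    return K * total

def potential_at(i, charges, positions):
    """Potential at the location of charge i due to all the other charges."""
    return K * sum(
        q / math.dist(positions[i], r)
        for j, (q, r) in enumerate(zip(charges, positions))
        if j != i
    )

def half_sum_energy(charges, positions):
    """Assembly energy as (1/2) * sum_i q_i V(r_i)."""
    return 0.5 * sum(
        q * potential_at(i, charges, positions) for i, q in enumerate(charges)
    )

# Hypothetical sample configuration: charges in coulombs, positions in metres.
charges = [1e-9, -2e-9, 3e-9]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

w_pairs = pair_energy(charges, positions)
w_half = half_sum_energy(charges, positions)
print(w_pairs, w_half)  # the two values should agree up to rounding
```

The pairwise form does half as many distance evaluations, while the half-sum form is the one that carries over to a continuous distribution.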
We now have an expression for the energy contained in a charge distribution, expressed in terms of the charge density $\rho$ and the potential $V$. But each of these quantities can be expressed in terms of the electric field, since $\EE=-\grad V$ and $\grad\cdot\EE=\rho/\epsilon_0$ from Gauss' Law. We can therefore rewrite the energy in terms of the electric field alone. Starting from \[ \rho V = (\epsilon_0 \grad\cdot\EE) V \] we are reminded of the product rule for the divergence (see § Product Rules), namely \begin{eqnarray*} \grad\cdot(V \EE) &=& \grad V \cdot\EE + V \,\grad\cdot\EE \end{eqnarray*} which in turn suggests integrating by parts. Doing so yields \begin{eqnarray*} W &=& \frac{\epsilon_0}{2} \int (\grad\cdot\EE) \> V \,d\tau \\ &=& \frac{\epsilon_0}{2} \int \left( -\grad V\cdot\EE + \grad\cdot(V\EE) \right) \,d\tau \\ &=& \frac{\epsilon_0}{2} \left( \int |\EE|^2 \,d\tau + \oint V\,\EE\cdot d\AA\right) = \frac{\epsilon_0}{2} \int |\EE|^2 \,d\tau \end{eqnarray*} where we have used $\EE=-\grad V$ to replace $-\grad V\cdot\EE$ by $|\EE|^2$, and the Divergence Theorem to obtain the surface integral in the last line. Taking the region of integration to be all of space, the surface integral is evaluated at infinity, where it vanishes provided that $V\EE$ falls off faster than $1/r^2$. This is the case for any localized charge distribution, for which $V$ falls off at least as fast as $1/r$ and $|\EE|$ at least as fast as $1/r^2$.
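As a sanity check on the final result, the two continuous expressions, $\frac12\int\rho V\,d\tau$ and $\frac{\epsilon_0}{2}\int|\EE|^2\,d\tau$, can be compared for a concrete case. The sketch below assumes a uniformly charged solid sphere of total charge `Q` and radius `R`, an example not taken from the text above, and uses the standard closed-form field and potential for that configuration. The helper `simpson` and the sample values of `Q` and `R` are illustrative choices. Both integrals should reproduce the known energy $\frac35\,\frac{Q^2}{4\pi\epsilon_0 R}$.

```python
import math

EPS0 = 8.8541878128e-12            # vacuum permittivity in F/m
K = 1.0 / (4.0 * math.pi * EPS0)   # Coulomb constant 1/(4 pi eps0)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Hypothetical sample values: total charge (C) and sphere radius (m).
Q = 1e-9
R = 0.1
rho = Q / (4.0 / 3.0 * math.pi * R**3)   # uniform charge density

def e_inside(r):
    """Field magnitude inside the uniformly charged sphere (r <= R)."""
    return K * Q * r / R**3

def v_inside(r):
    """Potential inside the sphere (r <= R), with V -> 0 at infinity."""
    return K * Q / (2 * R) * (3 - r**2 / R**2)

# (1/2) * integral of rho V over the sphere; rho vanishes outside it.
w_rho_v = 0.5 * simpson(lambda r: rho * v_inside(r) * 4 * math.pi * r**2, 0.0, R)

# (eps0/2) * integral of |E|^2 over ALL space: interior done numerically,
# exterior in closed form, since E = K Q / r^2 there gives
# integral_R^inf (K Q / r^2)^2 * 4 pi r^2 dr = 4 pi (K Q)^2 / R.
interior = simpson(lambda r: e_inside(r)**2 * 4 * math.pi * r**2, 0.0, R)
exterior = 4 * math.pi * (K * Q)**2 / R
w_field = 0.5 * EPS0 * (interior + exterior)

w_exact = 0.6 * K * Q**2 / R
print(w_rho_v, w_field, w_exact)  # all three should agree closely
```

Note that the charge integral extends only over the sphere, whereas the field integral must include the exterior region, which here contributes most of the energy.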