- Basic familiarity with differential calculus.
Using Differentials to Compute Derivatives
Derivatives are instantaneous rates of change, which are in turn the ratios of small changes. There are two traditional notations for derivatives, which you have likely already seen.
Newton: In this notation, due to Newton, the primary objects are functions, such as $f(x)=x^2$, and derivatives are written with a prime, as in $f'(x)=2x$.
Leibniz: In this notation, due to Leibniz, the primary objects are relationships, such as $y=x^2$, and derivatives are written as a ratio, as in $\frac{dy}{dx}=2x$.
Both notations are in common use, and both work well for functions of a single variable. However, Leibniz notation is better suited to situations involving many changing quantities, both because it keeps explicit track of which derivative you took (“with respect to $x$”), and because it emphasizes that derivatives are ratios. Among other things, this helps you get the units right: miles per hour is a ratio of miles to hours!
Both of these notations distinguish between dependent quantities ($f(x)$ or $y$) and the independent variable ($x$). However, in the real world one doesn't always know in advance which variables are independent. We therefore go one step further, and express derivatives in terms of differentials.
The intuitive idea behind differentials is to consider the small quantities “$dy$” and “$dx$” separately, then take their ratio. So rather than either of the above expressions, we write \begin{equation} dy = 2x\,dx \label{zapex} \end{equation} You can think of (\ref{zapex}) as the numerator of Leibniz notation, or as shorthand for a limit argument, or in terms of differential forms, or nonstandard analysis, or …; it doesn't matter.
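A quick numerical check of (\ref{zapex}) may help make the idea concrete; the values $x=3$ and $dx=0.01$ are chosen here purely for illustration. The differential predicts \[ dy = 2x\,dx = 2(3)(0.01) = 0.06 \] while the actual change in $y=x^2$ is \[ (3.01)^2 - 3^2 = 0.0601 \] The two differ only by $dx^2=0.0001$, which is negligible compared to $dx$ when $dx$ is small.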
The beauty of this approach is that differentiation is easy once you have convinced yourself of a few basic rules. The basic differentiation formulas in differential notation are: \begin{eqnarray} d\left(u^n\right) &=& nu^{n-1} \,du \\ d\left(e^u\right) &=& e^u \,du \\ d(\sin u) &=& \cos u \,du \\ d(\cos u) &=& -\sin u \,du \\ d(\ln u) &=& \frac{1}{u} \>du \\ d(\tan u) &=& \frac{1}{\cos^2u} \>du \\ \noalign{\bigskip} d(u+cv) &=& du + c \,dv \\ d(uv) &=& u \,dv + v \,du \\ d\left(\frac{u}{v}\right) &=& \frac{v \,du - u \,dv}{v^2} \end{eqnarray} where $c$ is a constant.
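As a sketch of how these rules combine, consider $y=x^2\sin x$ (an arbitrary function chosen for illustration). Setting $u=x^2$ and $v=\sin x$ in the product rule, then applying the power and sine rules, gives \[ dy = x^2\,d(\sin x) + \sin x\,d\left(x^2\right) = x^2\cos x\,dx + 2x\sin x\,dx \] Every term ends in $dx$, so dividing by $dx$ recovers the familiar derivative, $\frac{dy}{dx} = x^2\cos x + 2x\sin x$.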
You may have noticed that this list does not contain either the chain rule or a rule for inverse functions. In differential notation, these rules aren't necessary! In Leibniz notation, the chain rule says that \[ \frac{dy}{dx} = \frac{dy}{du} \> \frac{du}{dx} \] and the rule for the derivative of inverse functions is \[ \frac{dx}{dy} = \frac{1}{dy/dx} \] In differential notation, both of these statements follow immediately from the ordinary rules for manipulating fractions; there is no need to remember them separately! Similarly, implicit differentiation can be accomplished in differential notation simply by “zapping” every term in an equation with $d$, that is, by taking the differential (not the derivative) of both sides of the equation, using the above rules.
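Two short computations, with functions chosen only for illustration, show how this works in practice. For $y=\sin\left(x^2\right)$, set $u=x^2$; then \[ dy = \cos u\,du = \cos\left(x^2\right)\,2x\,dx \] so the chain rule has been applied without being invoked by name. For $y=\ln x$, suppose only the exponential rule were known: rewriting the relationship as $x=e^y$ and zapping both sides with $d$ gives \[ dx = e^y\,dy \quad\Longrightarrow\quad dy = \frac{dx}{e^y} = \frac{1}{x}\,dx \] in agreement with the logarithm rule in the list above.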
As a simple example, consider the problem of finding the slope of the tangent line to a circle at an arbitrary point. We have \begin{equation} x^2 + y^2 = a^2 \end{equation} with $a$ constant, and zapping each term with $d$ yields \begin{equation} 2x\,dx + 2y\,dy = 0 \end{equation} from which it is easy to derive \begin{equation} \frac{dy}{dx} = -\frac{x}{y} \end{equation} which can then be evaluated at the desired point.
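For instance, taking $a=5$ and the point $(3,4)$ (values assumed here for concreteness, and satisfying $3^2+4^2=5^2$), the slope is \[ \left.\frac{dy}{dx}\right|_{(3,4)} = -\frac{3}{4} \] At the point $(5,0)$ the ratio $\frac{dy}{dx}$ is undefined, but the differential relation still makes sense: it reduces to $10\,dx=0$, that is, $dx=0$, which describes a vertical tangent line.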
Note the shift in emphasis when using differentials:
- We didn't solve for $y$ (or $x$); no attempt was made to identify dependent and independent variables.
- It was essential to keep track of which derivatives we were taking, by including $dx$ or $dy$ in the result of zapping each term with $d$. (There is no such operation as “take the derivative”; taking the derivative with respect to $x$ is not the same as taking the derivative with respect to $y$!)
- The answer was not a differential, but rather the ratio of two differentials.
A nice mnemonic to help you through such computations, especially at first, is to think of terms involving $d$ as small. Derivatives are the ratios of small quantities, but they are not themselves small. And each term in an equation must have the same character, big or small. Put differently, the $d$s must balance; (infinitesimally) small quantities can never be equal to (finite) big quantities.
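One way to apply this check, using the earlier relationship $y=x^2$: the statement \[ dy = 2x \] cannot be right, since the left-hand side is small and the right-hand side is not. By contrast, both \[ dy = 2x\,dx \qquad\hbox{and}\qquad \frac{dy}{dx} = 2x \] are balanced. The first equates two small quantities; the second equates a ratio of small quantities, which is not itself small, to a finite quantity.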