Geometric Definition

How do you compute the derivative of a quantity that depends on a single variable? By taking the ratio of small changes in the quantity to small changes in the variable. But what if the quantity depends on several variables, such as the temperature in the room? Use the same strategy – but the result will now depend on which direction you go.

Suppose therefore that you are given a quantity $T$, such as temperature, that depends on position in space. We could express that position in terms of rectangular coordinates $(x,y,z)$, but let's save that for later. To find the derivative of $T$ at a given point and in a given direction, we must first specify the point and the direction. We know how to do that!

A point in space can be described in terms of the position vector $\rr$ from the origin to the given point. And the direction is determined by considering any curve through the given point; a small change in position along the curve is, of course, described by $d\rr$.

So consider the small change $dT$ in $T$ along the curve. As $ds=|d\rr|$ shrinks to $0$, so does $dT$; the derivative of $T$ along the curve is the (limiting value of the) ratio of the small quantities $dT$ and $ds$.

Now consider computing this derivative in all possible directions. There will be one direction (“uphill”) in which the derivative is as large as possible. We define the gradient of $T$, written $\grad T$, to be the vector whose direction is that in which $T$ increases the fastest, and whose magnitude is the derivative of $T$ in that direction. This construction yields the gradient of $T$ at a given point, and we can repeat the process at any point; the gradient of $T$ is a vector field.

How much does $T$ change in an arbitrary direction? Suppose we have a curve through the given point that goes in this new direction. Since derivatives are linear, the rate of change along this new curve is just the projection of $\grad T$ along the curve, namely $$\frac{dT}{ds} = \grad T \cdot \frac{d\rr}{|d\rr|}$$ where $d\rr$ and $ds$ now refer to the new curve, and are therefore different than before. Remembering that $|d\rr|=ds$, we can rewrite this expression as $$dT = \grad T \cdot d\rr \label{Master}$$ which we refer to as the Master Formula. The Master Formula can be taken as the geometric definition of the gradient.

Coordinate Expression

We can now use our knowledge of the multivariable differential $dT$ to obtain a formula for $\grad T$ in rectangular coordinates. The chain rule for a function of several variables takes the form $$dT = \Partial{T}{x}\,dx + \Partial{T}{y}\,dy + \Partial{T}{z}\,dz$$ in which each term is a product of two factors, labeled by $x$, $y$, and $z$. This looks very much like a dot product! Separating out the pieces, we have $$dT = \left( \Partial{T}{x}\,\xhat + \Partial{T}{y}\,\yhat + \Partial{T}{z}\,\zhat \right) \cdot (dx\,\xhat + dy\,\yhat + dz\,\zhat)$$ and we know that $$d\rr = dx\,\xhat + dy\,\yhat + dz\,\zhat$$ which, by comparison with the Master Formula, leads to $$\grad T = \Partial{T}{x}\,\xhat + \Partial{T}{y}\,\yhat + \Partial{T}{z}\,\zhat .$$
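As a rough numerical check of the rectangular formula, the sketch below compares the actual change in a scalar field over a small displacement with the Master Formula estimate $\grad T\cdot d\rr$. The field `T`, the point, and the displacement are hypothetical choices made only for illustration, and the partial derivatives in `grad_T` were worked out by hand for that particular field.

```python
import math

def T(x, y, z):
    # Hypothetical scalar field, chosen only for illustration.
    return x**2 * y + math.sin(z)

def grad_T(x, y, z):
    # Partial derivatives of the field above, computed by hand.
    return (2 * x * y, x**2, math.cos(z))

def master_formula_estimate(point, dr):
    # dT = grad T . dr, evaluated with the rectangular components.
    return sum(g * d for g, d in zip(grad_T(*point), dr))

point = (1.0, 2.0, 0.5)
dr = (1e-4, -2e-4, 3e-4)  # a small (but finite) displacement

# Actual change in T between the two nearby points.
actual = T(*(p + d for p, d in zip(point, dr))) - T(*point)
estimate = master_formula_estimate(point, dr)
print(actual, estimate)
```

The two printed numbers should agree up to terms that are second order in the size of the displacement, since the Master Formula is a statement about infinitesimal changes.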

Recall that $dT$ represents the infinitesimal change in $T$ when moving to a “nearby” point. What information do you need in order to know how $T$ changes? You must know something about how $T$ behaves, where you started, and which way you went. The Master Formula organizes this information into two geometrically different pieces, namely the gradient, containing generic information about how $T$ changes, and the vector differential $d\rr$, containing information about the particular change in position being made.

Geometric Interpretation

What does the gradient mean geometrically? Along a particular path, $df$ tells us something about how $f$ is changing. But the Master Formula tells us that $df=\grad f\cdot d\rr$, which means that the dot product of $\grad f$ with a vector tells us something about how $f$ changes along that vector. So let $\Hat w$ be a unit vector, and consider $$\grad{f} \cdot \Hat w = |\grad{f}| \> |\Hat w| \cos\theta = |\grad{f}| \cos\theta$$ which is clearly maximized by $\theta=0$. Thus, as claimed above, the direction of $\grad{f}$ is just the direction in which $f$ increases the fastest, and the magnitude of $\grad{f}$ is the rate of increase of $f$ in that direction (per unit distance, since $\Hat w$ is a unit vector). If you visualize the value of the scalar field $f$ as represented by color, then the gradient points in the direction in which the rate of change of the color is greatest.
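The claim that $\grad{f}\cdot\Hat w$ is largest when $\Hat w$ points along $\grad{f}$ can be illustrated with a brute-force scan over directions in the plane. This is only a sketch: the field $f(x,y)=x^2+3y$, the sample point, and the angular resolution are assumptions made for the example.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical field f(x, y) = x**2 + 3*y, computed by hand.
    return (2 * x, 3.0)

def rate_of_change(point, theta):
    # grad f . w for the unit vector w = (cos theta, sin theta).
    gx, gy = grad_f(*point)
    return gx * math.cos(theta) + gy * math.sin(theta)

point = (2.0, 1.0)

# Scan 3600 equally spaced directions and keep the one with the largest rate.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_angle = max(angles, key=lambda t: rate_of_change(point, t))

gx, gy = grad_f(*point)
gradient_angle = math.atan2(gy, gx)      # direction of grad f
gradient_magnitude = math.hypot(gx, gy)  # |grad f|

print(best_angle, gradient_angle)
print(rate_of_change(point, best_angle), gradient_magnitude)
```

Up to the resolution of the scan, the best direction coincides with the direction of the gradient, and the largest rate of change coincides with its magnitude.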

You can also visualize the gradient using the level surfaces on which $f(x,y,z)={\rm const}$. (In two dimensions there is the analogous concept of level curves, on which $f(x,y)={\rm const}$.) Consider a small displacement $d\rr$ that lies on the level surface, that is, start at a point on the level surface, and move along the surface. Then $f$ doesn't change in that direction, so $df=0$. But then $$\grad{f} \cdot d\rr = df = 0 \label{fconst}$$ so that $\grad{f}$ is perpendicular to $d\rr$. Since this argument works for any vector displacement $d\rr$ in the surface, $\grad{f}$ must be perpendicular to the level surface.

If you prefer working with derivatives instead of differentials, consider a curve $\rr(u)$ that lies in the level surface. Now simply divide ($\ref{fconst}$) by $du$, obtaining $$\grad{f} \cdot \frac{d\rr}{du} = \frac{df}{du} = 0$$ so that $\grad{f}$ is perpendicular to the tangent vector $\frac{d\rr}{du}$ (which is just the velocity vector if the parameter $u$ represents time). Again, this argument applies to any curve in the level surface, so $\grad{f}$ must be perpendicular to every such curve. In other words, $\grad{f}$ is perpendicular to the level surfaces of $f$: $$\grad{f} \perp \{f(x,y,z)={\rm const}\} .$$
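The derivative form of this argument is easy to try out on a concrete level curve. The sketch below assumes the hypothetical field $f(x,y)=x^2+4y^2$, whose level curves are ellipses, together with a hand-written parametrization $\rr(u)$ of one of them; it evaluates $\grad{f}\cdot\frac{d\rr}{du}$ at a few arbitrarily chosen parameter values.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical field f(x, y) = x**2 + 4*y**2.
    return (2 * x, 8 * y)

def level_curve(u, c=4.0):
    # Parametrization r(u) of the ellipse x**2 + 4*y**2 = c.
    a = math.sqrt(c)
    return (a * math.cos(u), 0.5 * a * math.sin(u))

def tangent(u, c=4.0):
    # Tangent vector dr/du for the parametrization above.
    a = math.sqrt(c)
    return (-a * math.sin(u), 0.5 * a * math.cos(u))

def dot_with_tangent(u):
    # grad f . dr/du evaluated at the point r(u) on the level curve.
    g = grad_f(*level_curve(u))
    t = tangent(u)
    return g[0] * t[0] + g[1] * t[1]

dots = [dot_with_tangent(u) for u in (0.0, 0.7, 1.9, 3.3, 5.0)]
print(dots)
```

Each dot product should vanish up to floating-point rounding, in agreement with the statement that the gradient is perpendicular to the level curve.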

This orthogonality is shown for the case of level curves in the figure at the right, which shows several level curves, together with the gradient vector at several points along one of them. You can think of such diagrams as topographic maps, showing the “height” at any location. The magnitude of the gradient vector is greatest where the level curves are close together, so that the “hill” is steepest. It is in this sense (only) that the gradient points “uphill.”

An alternative way of seeing this orthogonality is to recognize that, since the gradient is a derivative operator, its value depends only on what is happening locally. If you zoom in close enough to a given point, the level surfaces are parallel, and the gradient points in the direction from one level surface to the next.

Other Coordinates

The Master Formula can be used to derive formulas for the gradient in other coordinate systems. We illustrate the method for polar coordinates.

In polar coordinates, we have $$df = \Partial{f}{r}\,dr + \Partial{f}{\phi}\,d\phi$$ and of course $$d\rr = dr\,\rhat + r\,d\phi\,\phat . \label{dr2}$$ Comparing these expressions with the Master Formula (\ref{Master}), we see immediately that we must have $$\grad f = \Partial{f}{r}\,\rhat + {{1}\over{r}}\Partial{f}{\phi}\,\phat . \label{gradpolar}$$ Note the factor of ${{1}\over{r}}$, which is needed to compensate for the factor of $r$ in (\ref{dr2}). Such factors are typical for the component expressions of vector derivatives in curvilinear coordinates.

Why would one want to compute the gradient in polar coordinates? Consider the computation of $\grad\,\left({\ln\sqrt{x^2+y^2}}\right)$, which can be done by brute force in rectangular coordinates; the calculation is straightforward but messy, even if you first use the properties of logarithms to remove the square root. Alternatively, using ($\ref{gradpolar}$), it follows immediately that $$\grad\,\left({\ln\sqrt{x^2+y^2}}\right) = \grad\,({\ln r}) = {1\over r}\,\rhat .$$
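The polar result can be compared with the brute-force rectangular computation numerically. The following sketch estimates the rectangular partial derivatives by central differences and compares them with ${1\over r}\,\rhat$, using $\rhat = (x\,\xhat + y\,\yhat)/r$; the sample point and step size are arbitrary assumptions.

```python
import math

def f(x, y):
    # The field ln(sqrt(x^2 + y^2)) from the text, in rectangular coordinates.
    return math.log(math.sqrt(x**2 + y**2))

def grad_rectangular(x, y, h=1e-6):
    # Central-difference estimates of (df/dx, df/dy).
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (dfdx, dfdy)

def grad_polar_formula(x, y):
    # (1/r) r-hat, with r-hat = (x/r, y/r) written in rectangular components.
    r = math.hypot(x, y)
    return (x / r**2, y / r**2)

numeric = grad_rectangular(3.0, 4.0)
exact = grad_polar_formula(3.0, 4.0)
print(numeric, exact)
```

At the point $(3,4)$, where $r=5$, both approaches should give components close to $3/25$ and $4/25$.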

Exactly the same construction can be used to find the gradient in other coordinate systems. For instance, in cylindrical coordinates we have $$dV = \Partial{V}{r} \,dr + \Partial{V}{\phi} \,d\phi + \Partial{V}{z} \, dz$$ and since in cylindrical coordinates $$d\rr = dr\,\rhat + r\,d\phi\,\phat + dz\,\zhat$$ we obtain $$\grad V = \Partial{V}{r} \,\rhat + \frac{1}{r} \Partial{V}{\phi} \,\phat + \Partial{V}{z} \,\zhat .$$ This formula and the similar formulas for other vector derivatives in rectangular, cylindrical, and spherical coordinates are sufficiently important to the study of electromagnetism that they can, for instance, be found on the inside front cover of Griffiths' textbook, Introduction to Electrodynamics, and can also be found here.

Product Rule

Like all derivative operators, the gradient is linear (the gradient of a sum is the sum of the gradients), and also satisfies a product rule $$\grad(fg) = (\grad{f})\,g + f\,(\grad{g}) .$$ This formula can be obtained either by working out its components in, say, rectangular coordinates, and using the product rule for partial derivatives, or directly from the product rule in differential form, which is $$d(fg) = (df)\,g + f\,(dg) .$$
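As a sketch of the second route, apply the Master Formula to each of $f$ and $g$ in the differential product rule: $$d(fg) = g\,df + f\,dg = g\,(\grad{f}\cdot d\rr) + f\,(\grad{g}\cdot d\rr) = \bigl( g\,\grad{f} + f\,\grad{g} \bigr)\cdot d\rr .$$ The Master Formula applied to the product itself says that $d(fg) = \grad(fg)\cdot d\rr$. Since the two expressions must agree for every displacement $d\rr$, the vectors being dotted with $d\rr$ must be equal, which is the product rule for the gradient.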
