How do you compute a derivative of a quantity that depends on a single variable? By taking the ratio of small changes in the quantity to small changes in the variable. But what if the quantity depends on several variables, such as the temperature in the room? Use the same strategy — but the result will depend on which direction you go.
Suppose therefore that you are given a quantity $T$ that depends on position in space. We could express that position in terms of rectangular coordinates $(x,y,z)$, but let's save that for later. To find the derivative of $T$ at a given point and in a given direction, we must first specify the point and the direction. We know how to do that!
A point in space can be described in terms of the position vector $\rr$ from the origin to the given point. And the direction is determined by considering any curve through the given point; a small change in position along the curve is, of course, described by $d\rr$.
So consider the small change $dT$ in $T$ along the curve. As $ds=|d\rr|$ shrinks to $0$, so does $dT$; the derivative of $T$ along the curve is the (limiting value of the) ratio of the small quantities $dT$ and $ds$.
Now consider computing this derivative in all possible directions. There will be one direction in which the derivative is as large as possible. We define the gradient of $T$, written $\grad T$, to be the vector whose direction is the direction in which $T$ increases the fastest, and whose magnitude is the derivative of $T$ in that direction. This construction yields the gradient of $T$ at a given point, and we can repeat the process at any point; the gradient of $T$ is a vector field.
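The construction above can be tried numerically. The following is only a minimal sketch under stated assumptions: the field `T`, the sample point, and the helper names `rate_along` and `steepest_direction` are hypothetical and chosen for illustration, and the problem is restricted to a plane so that a direction is a single angle. It estimates $dT/ds$ in many directions and reports the direction in which the rate is largest.

```python
import math

def T(x, y):
    # Hypothetical temperature field in a plane, chosen only for illustration.
    return x**2 + 3.0 * y

def rate_along(T, point, angle, ds=1e-6):
    """Estimate dT/ds at `point` in the direction given by `angle` (radians)."""
    x, y = point
    dx, dy = ds * math.cos(angle), ds * math.sin(angle)
    # Central difference: compare T one small step ahead and one step behind.
    dT = T(x + dx, y + dy) - T(x - dx, y - dy)
    return dT / (2.0 * ds)

def steepest_direction(T, point, n=3600):
    """Scan n equally spaced directions; return (angle, rate) with the largest dT/ds."""
    angles = (2.0 * math.pi * k / n for k in range(n))
    return max(((a, rate_along(T, point, a)) for a in angles),
               key=lambda pair: pair[1])

# Direction and magnitude of the gradient at the sample point (1, 2),
# found purely by comparing rates of change in different directions.
angle, rate = steepest_direction(T, (1.0, 2.0))
print(math.degrees(angle), rate)
```

The scan only resolves the direction to the spacing of the sampled angles, so it is a check on the geometric definition rather than a practical way to compute a gradient.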
How much does $T$ change in an arbitrary direction? Suppose we have a curve through the given point that goes in this new direction. Since the small change $dT$ depends linearly on the small displacement $d\rr$, the rate of change along this new curve is just the projection of $\grad T$ along the curve, namely \begin{equation} \frac{dT}{ds} = \grad T \cdot \frac{d\rr}{|d\rr|} \end{equation} where $d\rr$ and $ds$ now refer to the new curve, and are therefore different from before. Remembering that $|d\rr|=ds$, we can rewrite this expression as \begin{equation} dT = \grad T \cdot d\rr \end{equation} which we refer to as the Master Formula, and which can also be taken as the definition of the gradient.
We can now use our knowledge of the multivariable differential $dT$ to obtain a formula for $\grad T$ in rectangular coordinates. We have \begin{equation} dT = \Partial{T}{x}\,dx + \Partial{T}{y}\,dy + \Partial{T}{z}\,dz \end{equation} and \begin{equation} d\rr = dx\,\xhat + dy\,\yhat + dz\,\zhat \end{equation} which leads to \begin{equation} \grad T = \Partial{T}{x}\,\xhat + \Partial{T}{y}\,\yhat + \Partial{T}{z}\,\zhat \end{equation} as was already asserted in § {Gradient}.
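As a rough numerical check of the rectangular formula and of the Master Formula, one can compare $\grad T \cdot d\rr$ with the actual small change in $T$. This is a sketch under stated assumptions: the field `T`, the sample point, the displacement `dr`, and the helpers `grad_T` and `dot` are all hypothetical, and the partial derivatives are approximated by finite differences rather than computed exactly.

```python
def T(x, y, z):
    # Hypothetical temperature field, chosen only for illustration.
    return x * y + z**2

def grad_T(x, y, z, h=1e-6):
    """Approximate the rectangular components of grad T by central differences."""
    return (
        (T(x + h, y, z) - T(x - h, y, z)) / (2 * h),  # partial T / partial x
        (T(x, y + h, z) - T(x, y - h, z)) / (2 * h),  # partial T / partial y
        (T(x, y, z + h) - T(x, y, z - h)) / (2 * h),  # partial T / partial z
    )

def dot(a, b):
    """Dot product of two vectors given as tuples of components."""
    return sum(ai * bi for ai, bi in zip(a, b))

point = (1.0, 2.0, 3.0)
dr = (1e-4, -2e-4, 3e-4)  # a small displacement (dx, dy, dz)

g = grad_T(*point)        # a hand calculation gives (y, x, 2z) at this point
dT_master = dot(g, dr)    # right-hand side of the Master Formula
dT_actual = T(*(p + d for p, d in zip(point, dr))) - T(*point)

print(g, dT_master, dT_actual)
```

The two values of $dT$ agree only to first order in the displacement; the remaining discrepancy shrinks faster than $|d\rr|$ as the step is made smaller, which is what the limiting process in the definition requires.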