Suppose first that you want to extremize a function $f$ of two variables subject to the constraint $g=\hbox{constant}$. In two dimensions, level sets are curves, and we can introduce a coordinate $v$ along the level curves of $g$. Then $v$ and $g$ can be used as coordinates, at least in a small region about any point. Thus, \begin{equation} df = \Partial{f}{g}\,dg + \Partial{f}{v}\,dv \end{equation} The condition that $f$ be extremized at a point $P$ on a given level curve of $g$ is precisely that \begin{equation} \Partial{f}{v}\bigg|_P = 0 \end{equation} Thus, at $P$, we have \begin{equation} df = \lambda \,dg \label{dlagrange} \end{equation} where \begin{equation} \lambda = \Partial{f}{g}\bigg|_P \end{equation} Equation ($\ref{dlagrange}$) can be used to find the point(s) $P$ on the level curve where $f$ is extremized: Simply expand both sides in terms of the original variables, and realize that corresponding partial derivatives must therefore be proportional, that is \begin{equation} \Partial{f}{x^i} = \lambda \Partial{g}{x^i} \end{equation} These two equations (together with the constraint condition) can be solved for $x^i$ (and $\lambda$); this is the method of Lagrange multipliers.
It is straightforward to generalize this procedure to higher dimensions. If $f$ is a funciton of $n$ variables, and the constrain function is also, then the level sets of $g$ are $(n-1)$-dimensional surfaces, on which we can introduce the $n-1$ coordinates $v^i$; $\{g,v^i\}$ can therefore be used as (local) coordinates. Thus, \begin{equation} df = \Partial{f}{g}\,dg + \sum\limits_i \Partial{f}{v^i}\,dv^i \end{equation} and the condition that $f$ be extremized at a point $P$ on a given level surface of $g$ is that \begin{equation} \Partial{f}{v^i}\bigg|_P = 0 \end{equation} for each $i$. Thus, at $P$, we again have ($\ref{dlagrange}$); only the number of “components” of this equation has changed.
We can further generalize this procedure to multiple constraints, involving constraint functions $g^j$. The intersection of $k$ such constraint surfaces will in general be an $(n-k)$-dimensional surface, on which we can introduce $n-k$ coordinates $v^i$, and we must now use $\{g^j,v^i\}$ as coordinates, leading to \begin{equation} df = \sum\limits_j \Partial{f}{g^j}\,dg^j + \sum\limits_i \Partial{f}{v^i}\,dv^i \end{equation} The condition that $f$ be extremized on this intersection surface is the same as before, namely that the second term above vanish, but now we obtain \begin{equation} df = \sum\limits_j \lambda^j dg^j \end{equation} at the point $P$. Thus, the differential of $f$ is a linear combination of the differentials of the constraint functions, a condition which must also hold for each partial derivative. The resulting system of equations can be solved for $P$ (and $\lambda^j$), at least in principle.