11. Multivariable Calculus
11.1. First order partial derivatives
We can consider the rate of change of a function \(f(x,\,y)\); however, since it is a function of two variables, there are two possible kinds of derivative we can find:
along a curve parallel to the \(x\)-axis, by holding \(y\) constant and differentiating with respect to \(x\).
along a curve parallel to the \(y\)-axis, by holding \(x\) constant and differentiating with respect to \(y\).
We call these partial derivatives, denoted here by:
\[ \frac{\partial f}{\partial x} \quad \text{and} \quad \frac{\partial f}{\partial y} \]
Note that the symbol \(\partial\) is distinct from the \(\mathrm{d}\) used in one-variable calculus. The derivative is partial because we consider variations in only one of the two variables. The results give the local rate of change parallel to each axis at a point \((x,\,y)\).
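The plot shown in Fig. 11.1 is of the function \(f(x,\, y)= x^3 - y^3 - 2xy + 2\), with curves marked on the surface along lines of constant \(x\) and constant \(y\):
Just like the one-variable derivative, there is a limit definition for the partial derivatives of a function \(f = f(x,\,y)\):
\[ \frac{\partial f}{\partial x} = \lim_{h\rightarrow 0}\frac{f(x+h,\,y)-f(x,\,y)}{h}, \qquad \frac{\partial f}{\partial y} = \lim_{h\rightarrow 0}\frac{f(x,\,y+h)-f(x,\,y)}{h} \]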
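The limit definition can be explored numerically by evaluating the difference quotient for a small but finite step. The sketch below is only an illustration: the function names `partial_x` and `partial_y`, the step size `h` and the sample point \((1,\,2)\) are our own choices, and the hand-differentiated results \(f_x = 3x^2 - 2y\), \(f_y = -3y^2 - 2x\) are used for comparison.

```python
def f(x, y):
    # Surface plotted in Fig. 11.1
    return x**3 - y**3 - 2*x*y + 2

def partial_x(func, x, y, h=1e-6):
    # Difference quotient in x, with y held constant
    return (func(x + h, y) - func(x, y)) / h

def partial_y(func, x, y, h=1e-6):
    # Difference quotient in y, with x held constant
    return (func(x, y + h) - func(x, y)) / h

x0, y0 = 1.0, 2.0  # sample point (an arbitrary choice)
fx_numeric = partial_x(f, x0, y0)
fy_numeric = partial_y(f, x0, y0)

# Hand-differentiated results for comparison
fx_exact = 3 * x0**2 - 2 * y0
fy_exact = -3 * y0**2 - 2 * x0

print(fx_numeric, fx_exact)
print(fy_numeric, fy_exact)
```

The numerical values only approximate the derivatives, since \(h\) is finite rather than taken to the limit \(h\rightarrow 0\).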
Problems
Calculate all the first partial derivatives \(\partial/\partial x,\, \partial/\partial y\) for the following functions:
1. \(f(x,\,y) = 3x^3 y^2 + 2 y \)
2. \(f(x,\,y) = x^2 \ln(3x+y)\)
3. \(z(x,\,y) = \ln(x+y^2\sin(x))\)
Solutions
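1. \(f(x,\,y) = 3x^3 y^2 + 2 y \)
\[ \frac{\partial f}{\partial x} = 9x^2 y^2, \qquad \frac{\partial f}{\partial y} = 6x^3 y + 2 \]
2. \(f(x,\,y) = x^2 \ln(3x+y)\)
\[ \frac{\partial f}{\partial x} = 2x\ln(3x+y) + \frac{3x^2}{3x+y}, \qquad \frac{\partial f}{\partial y} = \frac{x^2}{3x+y} \]
3. \(z(x,\,y) = \ln(x+y^2\sin(x))\)
\[ \frac{\partial z}{\partial x} = \frac{1+y^2\cos(x)}{x+y^2\sin(x)}, \qquad \frac{\partial z}{\partial y} = \frac{2y\sin(x)}{x+y^2\sin(x)} \]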
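11.2. Second order partial derivatives
The second partial derivatives with respect to \(x\) and \(y\) are denoted as follows:
\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right), \qquad \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) \]
The notation can also be extended to mixed second partial derivatives, where we take both the \(x\) and the \(y\) partial derivative:
\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right), \qquad \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) \]
Notice that we work from the inside out, as with function composition and matrix multiplication. For any well-behaved function (one whose second partial derivatives exist and are continuous), these two expressions are always equal. The proof of this result (called Schwarz’s theorem) is quite involved and is beyond the scope of this course.
As an example, let's calculate all second partial derivatives of the function \(f(x,y)=3x^3y^2+2y\):
\[ \frac{\partial f}{\partial x} = 9x^2y^2, \qquad \frac{\partial f}{\partial y} = 6x^3y + 2 \]
\[ \frac{\partial^2 f}{\partial x^2} = 18xy^2, \qquad \frac{\partial^2 f}{\partial y^2} = 6x^3, \qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = 18x^2y \]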
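The equality of the two mixed derivatives for this example can be spot-checked numerically. The following is a minimal sketch, assuming the first derivatives \(\partial f/\partial x = 9x^2y^2\) and \(\partial f/\partial y = 6x^3y+2\) obtained by hand; the helper names `d_dx`, `d_dy` and the sample point are our own illustrative choices.

```python
def f_x(x, y):
    # First partial derivative of f = 3x^3 y^2 + 2y with respect to x
    return 9 * x**2 * y**2

def f_y(x, y):
    # First partial derivative of f = 3x^3 y^2 + 2y with respect to y
    return 6 * x**3 * y + 2

def d_dx(g, x, y, h=1e-5):
    # Central difference in x, holding y constant
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    # Central difference in y, holding x constant
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.0, 2.0  # sample point (an arbitrary choice)
x_then_y = d_dy(f_x, x0, y0)  # differentiate with respect to x, then y
y_then_x = d_dx(f_y, x0, y0)  # differentiate with respect to y, then x
expected = 18 * x0**2 * y0    # hand-calculated mixed derivative

print(x_then_y, y_then_x, expected)
```

A numerical check at one point is of course not a proof; it only illustrates the statement of Schwarz’s theorem for this particular function.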
A Common Mistake
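Let's look at the function \(f(x,\, y) = x^2 y^3 + x + y\) at the point \((1,\, 1)\), calculating the mixed partial derivative:
\[ \left.\frac{\partial^2 f}{\partial y\,\partial x}\right|_{(1,\,1)} \]
We could argue that we follow the process:
Put \(y=1\) into the function and then differentiate with respect to \(x\) to obtain:
\[ f(x,\,1) = x^2 + x + 1 \quad\Rightarrow\quad \frac{\mathrm{d}}{\mathrm{d}x}f(x,\,1) = 2x + 1 \]
Then put \(x=1\) into this function and differentiate with respect to \(y\) to obtain:
\[ \frac{\partial}{\partial y}(3) = 0 \]
The result is wrong, because we took \(y=1\) before differentiating with respect to \(y\). To avoid mistakes of this nature, we should always perform the differentiation first and only substitute in the values in the very last step. The correct result is:
\[ \frac{\partial f}{\partial x} = 2xy^3 + 1, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = 6xy^2, \qquad \left.\frac{\partial^2 f}{\partial y\,\partial x}\right|_{(1,\,1)} = 6 \]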
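11.3. Notation for partial derivatives
Partial derivatives are commonly denoted using subscript notation:
\[ f_x = \frac{\partial f}{\partial x}, \qquad f_y = \frac{\partial f}{\partial y}, \qquad f_{xx} = \frac{\partial^2 f}{\partial x^2}, \qquad f_{yy} = \frac{\partial^2 f}{\partial y^2} \]
For mixed derivatives the order of subscripts is from left to right:
\[ f_{xy} = (f_x)_y = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right), \qquad f_{yx} = (f_y)_x = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) \]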
You will likely come across yet more alternative notations in the literature, another common one being:
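11.4. Multivariable chain rule
We now consider a function \(f(x,\,y)\) subjected to small variations in both \(x\) and \(y\) as shown in Fig. 11.2.
Loosely speaking, the total change in the function \(f(x,\, y)\) is the sum of changes due to each variable:
\[ \Delta f \approx \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y \]
If we now suppose that we parameterise \(x=x(u,\, v)\) and \(y=y(u,\, v)\) then we may similarly write \(\Delta x\) and \(\Delta y\) as the sum of changes due to variables \(u\) and \(v\):
\[ \Delta x \approx \frac{\partial x}{\partial u}\Delta u + \frac{\partial x}{\partial v}\Delta v, \qquad \Delta y \approx \frac{\partial y}{\partial u}\Delta u + \frac{\partial y}{\partial v}\Delta v \]
Substituting these into the expression for \(\Delta f\) and holding \(v\) constant (\(\Delta v=0\)) gives:
\[ \frac{\Delta f}{\Delta u} \approx \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} \]
Holding \(u\) constant instead (\(\Delta u=0\)) gives:
\[ \frac{\Delta f}{\Delta v} \approx \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} \]
This was a somewhat hand-waving argument, but the results are valid in the limit \(\Delta u\rightarrow 0, \, \Delta v\rightarrow 0\) and can be proved using the limit definition of the derivative. From this we obtain the multivariable chain rule.
If \(f = f(x,\, y)\) where \(x=x(u,\, v)\) and \(y=y(u,\, v)\) then:
\[ \frac{\partial f}{\partial u} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u}, \qquad \frac{\partial f}{\partial v} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} \]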
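Many students encountering this rule for the first time think that it “can’t be right”, because replacing the partial derivatives with differences gives:
\[ \frac{\Delta f}{\Delta u} = \frac{\Delta f}{\Delta x}\frac{\Delta x}{\Delta u} + \frac{\Delta f}{\Delta y}\frac{\Delta y}{\Delta u} \]
which, after cancelling \(\Delta x\) and \(\Delta y\), suggests the result \(\Delta f = 2\Delta f\). However, this misunderstanding comes from ambiguity in writing \(\Delta f\).
On the left-hand side it means the change in \(f\) due to variations in both \(x\) and \(y\), whilst in \(f_x\) and \(f_y\) the changes are due to only one of these variables, with the other held constant. Written formally:
\[ \left(\frac{\partial f}{\partial u}\right)_v = \left(\frac{\partial f}{\partial x}\right)_y\left(\frac{\partial x}{\partial u}\right)_v + \left(\frac{\partial f}{\partial y}\right)_x\left(\frac{\partial y}{\partial u}\right)_v \]
The lesson here is: it is dangerous to treat partial derivatives as fractions!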
11.4.1. Dependency trees
The multivariate chain rule can be illustrated as a dependency tree, in Fig. 11.3, where we examine \(f(x,\, y)\) with \(x = x(u,\, v)\) and \(y = y(u,\, v)\):
For instance, if we follow the dependency routes involving \(u\), we get \(f_u = f_x\, x_u + f_y\, y_u\).
We can do the same thing for the second derivatives (a repeat application of the chain rule), in Fig. 11.4.
Worked Examples
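1. Let's look at \(f(x,y)=x^2 y+y^2\). If we have \(x = u+v\) and \(y = u-v\), then we can calculate \(f_u,\, f_v\) using dependency trees:
\[ f_u = f_x\, x_u + f_y\, y_u, \qquad f_v = f_x\, x_v + f_y\, y_v \]
meaning that we need:
\[ f_x = 2xy, \qquad f_y = x^2 + 2y, \qquad x_u = 1, \quad x_v = 1, \quad y_u = 1, \quad y_v = -1 \]
Putting these results together:
\[ f_u = 2xy + x^2 + 2y, \qquad f_v = 2xy - x^2 - 2y \]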
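The chain rule result for this example, \(f_u = f_x\,x_u + f_y\,y_u\) and \(f_v = f_x\,x_v + f_y\,y_v\), can be compared against a direct numerical derivative of the composed function. This is a sketch only: the names `F`, `chain_rule` and `central`, the step size and the sample point \((u,\,v) = (1.5,\,0.5)\) are our own choices for illustration.

```python
def f(x, y):
    return x**2 * y + y**2

def F(u, v):
    # f written directly as a function of u and v, using x = u + v, y = u - v
    return f(u + v, u - v)

def chain_rule(u, v):
    # Evaluate f_u and f_v by following the dependency tree
    x, y = u + v, u - v
    f_x = 2 * x * y
    f_y = x**2 + 2 * y
    x_u, x_v = 1, 1
    y_u, y_v = 1, -1
    return f_x * x_u + f_y * y_u, f_x * x_v + f_y * y_v

def central(g, t, h=1e-5):
    # Central difference approximation to a one-variable derivative
    return (g(t + h) - g(t - h)) / (2 * h)

u0, v0 = 1.5, 0.5  # sample point (an arbitrary choice)
f_u, f_v = chain_rule(u0, v0)
f_u_numeric = central(lambda u: F(u, v0), u0)  # v held constant
f_v_numeric = central(lambda v: F(u0, v), v0)  # u held constant

print(f_u, f_u_numeric)
print(f_v, f_v_numeric)
```

Agreement between the two columns at a sample point is a useful sanity check when applying the chain rule by hand.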
2.
To find \(f_x\), we can let \(u = x^2 + 2xy\), \(v = x-y\), then:
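11.4.2. Total differential
Definition
The total change in a function \(f = f(x,\, y,\, \dots)\) based on the changes in each of its variables can be expressed as the total differential:
\[ \mathrm{d}f = \frac{\partial f}{\partial x}\mathrm{d}x + \frac{\partial f}{\partial y}\mathrm{d}y + \dots \]
We can therefore express this as a total derivative, in terms of one of the variables:
\[ \frac{\mathrm{d}f}{\mathrm{d}x} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{\mathrm{d}y}{\mathrm{d}x} + \dots \]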
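If we have a differential in the form:
\[ \mathrm{d}f = P(x,\,y)\,\mathrm{d}x + Q(x,\,y)\,\mathrm{d}y \]
such that:
\[ \frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \]
then we call this an exact or perfect differential.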
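As an illustration of the exactness condition, suppose we write the differential as \(P\,\mathrm{d}x + Q\,\mathrm{d}y\) (the names \(P\) and \(Q\) are our own labels for the two coefficients) and test whether \(\partial P/\partial y = \partial Q/\partial x\). The sketch below only spot-checks the condition numerically at a few sample points of our choosing, so it can reveal that a differential is not exact but cannot prove that one is.

```python
def is_exact(P, Q, points, h=1e-5, tol=1e-6):
    # Spot-check dP/dy == dQ/dx at each sample point using central differences
    for x, y in points:
        P_y = (P(x, y + h) - P(x, y - h)) / (2 * h)
        Q_x = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
        if abs(P_y - Q_x) > tol:
            return False
    return True

sample_points = [(0.5, 1.0), (-1.0, 2.0), (2.0, -0.5)]  # arbitrary choices

# Coefficients of df for f = x^2 y + y^2, i.e. P = f_x and Q = f_y
exact_case = is_exact(lambda x, y: 2 * x * y,
                      lambda x, y: x**2 + 2 * y,
                      sample_points)

# y dx - x dy is a standard example of a differential that is not exact
inexact_case = is_exact(lambda x, y: y,
                        lambda x, y: -x,
                        sample_points)

print(exact_case, inexact_case)
```

The first pair passes because both coefficients come from the same function \(f\), so the condition reduces to \(f_{xy} = f_{yx}\).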
11.5. Stationary Points
We can translate our single-variable calculus toolkit for stationary points into a multivariable toolkit, the key caveat now being that:
1. when \(\displaystyle \frac{\partial f}{\partial x}=0\) the function is stationary (flat) with respect to the \(x\)-axis,
2. when \(\displaystyle \frac{\partial f}{\partial y}=0\) the function is stationary (flat) with respect to the \(y\)-axis.
So it is possible for a function to be stationary with respect to one axis and not the other, or to have one kind of stationary point along one axis and a different kind along the other (e.g. a maximum in \(x\) and a minimum in \(y\)).
Recall from the multivariate chain rule (the total differential):
\[ \mathrm{d}f = \frac{\partial f}{\partial x}\mathrm{d}x + \frac{\partial f}{\partial y}\mathrm{d}y \]
It is then apparent that when both \(\displaystyle \frac{\partial f}{\partial x}=0\) AND \(\displaystyle \frac{\partial f}{\partial y}=0\), the instantaneous rate of change of \(f\) is zero in any direction.
Worked Example
Consider the function in Fig. 11.1, \(f(x,\, y) = x^3 - y^3 - 2xy + 2\). Its stationary points can be found by solving \(f_x = f_y = 0\) simultaneously:
\[ f_x = 3x^2 - 2y = 0, \qquad f_y = -3y^2 - 2x = 0 \]
In general, it may be very difficult (or impossible!) to solve nonlinear equations by hand, and so we would need to resort to numerical methods. In this case, however, we can proceed by rearranging the first equation to give \(y = \frac{3}{2}x^2\) and substituting into the second, to obtain
\[ 27x^4 + 8x = x(27x^3 + 8) = 0 \]
This equation has real solutions \(x=0\) and \(x=-\frac{2}{3}\), as well as a complex conjugate pair of solutions \(\frac{1}{3}(1\pm\sqrt{3}i)\), which we will discard here.
Hence, using \(y = \frac{3}{2}x^2\), the stationary points are \((0,\, 0,\, 2)\) and \(\displaystyle \left(-\frac{2}{3},\, \frac{2}{3},\, \frac{62}{27}\right)\), where we write \((x,\, y,\, f)\).
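The two stationary points can be verified by substituting them back into \(f_x = 3x^2 - 2y\) and \(f_y = -3y^2 - 2x\). The short sketch below does this with exact rational arithmetic from Python's standard `fractions` module, so that no rounding error is involved; the variable names are our own.

```python
from fractions import Fraction

def f(x, y):
    return x**3 - y**3 - 2*x*y + 2

def f_x(x, y):
    return 3*x**2 - 2*y

def f_y(x, y):
    return -3*y**2 - 2*x

# Candidate stationary points (x, y) found by hand
candidates = [
    (Fraction(0), Fraction(0)),
    (Fraction(-2, 3), Fraction(2, 3)),
]

# For each candidate, record (f_x, f_y, f) evaluated exactly
results = [(f_x(x, y), f_y(x, y), f(x, y)) for x, y in candidates]

for (x, y), (gx, gy, value) in zip(candidates, results):
    print(f"x={x}, y={y}: f_x={gx}, f_y={gy}, f={value}")
```

Both partial derivatives vanish at each candidate, and the function values agree with the third coordinates quoted above.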
11.5.1. Classification of Stationary Points
For a function \(f(x,\,y)\), we might expect to classify stationary points using \(f_{xx}\) and \(f_{yy}\). After all:
\(f_{xx}\) tells us the function concavity parallel to the \(x\) axis
\(f_{yy}\) tells us the function concavity parallel to the \(y\) axis
If the function is concave up in both the \(x\) and \(y\) directions through a stationary point, then intuition tells us that this is a local minimum.
If the function is concave down in both the \(x\) and \(y\) directions through a stationary point, then our intuition tells us that this is a local maximum.
We can examine this through some example functions,
However, a local maximum/minimum is not the only type of stationary point that a surface \(f(x,\,y)\) can have. For instance, a surface may have a stationary point that sits where the function is concave upwards with respect to one axis and concave downwards with respect to the other axis. This type of point is called a saddle point (it looks like a saddle for a horse). The figure below shows an example:
We conclude that at a stationary point, if \(f_{xx}\) and \(f_{yy}\) are opposite sign, then the point is a saddle point. However, the converse is not necessarily true! It turns out that we can have a saddle point where \(f_{xx}\) and \(f_{yy}\) are both the same sign (or even when they are both zero). An example is illustrated in the figure below. In this case the saddle point is not aligned squarely with the \((x,\,y)\) coordinate directions.
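To make this concrete, here is a small numerical sketch using \(f(x,\,y) = x^2 + y^2 + 4xy\). This function is our own illustrative choice and is not necessarily the one shown in the figure. It has a stationary point at the origin with \(f_{xx} = f_{yy} = 2 > 0\), yet it decreases along the diagonal \(y = -x\).

```python
def f(x, y):
    # Illustrative function (our own choice): f_xx = f_yy = 2, f_xy = 4
    return x**2 + y**2 + 4*x*y

t = 0.1  # small displacement from the stationary point at the origin

along_x = f(t, 0.0)         # moving parallel to the x axis
along_y = f(0.0, t)         # moving parallel to the y axis
along_diagonal = f(t, t)    # moving along y = x
along_anti = f(t, -t)       # moving along y = -x

print(along_x, along_y, along_diagonal, along_anti)
```

Since \(f(0,\,0) = 0\), the function rises along both coordinate axes but falls along \(y = -x\), so the origin is a saddle point even though \(f_{xx}\) and \(f_{yy}\) are both positive.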
So, it turns out that the condition for a maximum/minimum is more complicated than we first thought! A valid classification algorithm is presented in the box below.
The result can be proved by utilising a multivariate Taylor series expansion about the stationary point and retaining terms only up to quadratic order so that the shape of the function may be inferred from the properties of a quadratic. Neglecting the higher order terms in the expansion is justified in the limit approaching the stationary point. We have not studied multivariate Taylor series, so the proof is not presented here.
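11.5.2. Hessian Matrix
At a stationary point \((x_0,\,y_0)\), where \(f_x(x_0,y_0)=f_y(x_0,y_0)=0\), we calculate the determinant of the Hessian matrix \(H(x_0,\,y_0)\):
\[ H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}, \qquad \det(H) = f_{xx}f_{yy} - f_{xy}f_{yx} \]
This can have a few different outcomes:
If \(\det(H(x_0,\,y_0))>0\) then the point is a local maximum or minimum: a minimum if \(f_{xx}>0\) and a maximum if \(f_{xx}<0\) (in this case \(f_{yy}\) necessarily has the same sign as \(f_{xx}\)).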
If \(\det(H(x_0,\,y_0))<0\) then the point is a saddle.
If \(\det(H(x_0,\,y_0))=0\) then the test is inconclusive and further analysis is needed.
Worked Example
Let's classify the stationary points of the function \(f=x^3-y^3-2xy+2\). We already found that the stationary points are located at \((0,0,2)\) and \(\displaystyle \left(-\frac{2}{3},\frac{2}{3},\frac{62}{27}\right)\).
Calculating the Hessian determinant components gives \(f_{xx}=6x, \quad f_{yy}=-6y, \quad f_{xy}=f_{yx}=-2\) and therefore:
\[ \det(H) = f_{xx}f_{yy} - f_{xy}f_{yx} = (6x)(-6y) - (-2)^2 = -36xy - 4 \]
\(\det(H(0,0))=-4<0\) so the origin is a saddle point.
\(\det\biggl(H\biggl(-\frac{2}{3},\frac{2}{3}\biggr)\biggr)=12>0\) and \(f_{xx}\left(-\frac{2}{3},\frac{2}{3}\right)<0\), so the point \(\left(-\frac{2}{3},\,\frac{2}{3}\right)\) is a local maximum.
A contour plot of the function, shown in Fig. 11.9, confirms these findings.
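The classification rules translate directly into a short function. The sketch below is a minimal illustration under the assumption that \(f_{xy} = f_{yx}\), so that \(\det(H) = f_{xx}f_{yy} - f_{xy}^2\); the function names `classify` and `second_derivatives` are our own. It is applied to the second derivatives \(f_{xx}=6x\), \(f_{yy}=-6y\), \(f_{xy}=-2\) of the worked example.

```python
from fractions import Fraction

def classify(f_xx, f_yy, f_xy):
    # Second derivative test, assuming f_xy == f_yx
    det = f_xx * f_yy - f_xy**2
    if det > 0:
        return "local minimum" if f_xx > 0 else "local maximum"
    if det < 0:
        return "saddle"
    return "inconclusive"

def second_derivatives(x, y):
    # Hand-calculated second derivatives of f = x^3 - y^3 - 2xy + 2
    return 6 * x, -6 * y, -2

origin = classify(*second_derivatives(Fraction(0), Fraction(0)))
other = classify(*second_derivatives(Fraction(-2, 3), Fraction(2, 3)))

print(origin)
print(other)
```

When the determinant is zero the function deliberately reports the test as inconclusive rather than guessing, since further analysis is needed in that case.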
Practice questions
1. Find the stationary points for the surface described by \(f(x,y) = x^2 + 3xy^2 + 2y^3\).
2. Find the stationary points for the surface described by \(f(x,y) = x^3 + y^3 - 3xy - 4\).
Solutions
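1. Finding the partial derivatives:
\[ f_x = 2x + 3y^2, \qquad f_y = 6xy + 6y^2 \]
To find the points which satisfy \(f_x = f_y = 0\), we have:
\[ 2x + 3y^2 = 0, \qquad 6y(x + y) = 0 \]
The second equation gives \(y = 0\) or \(y = -x\). Taking \(y=0\), the first equation gives \(x=0\), so \((0,\,0)\) is a stationary point.
Any other stationary point must satisfy \(y = -x\), hence we have to solve:
\[ 2x + 3x^2 = x(2 + 3x) = 0 \]
which has solutions \(x = 0\) (hence \(y = 0\), the point already found) or \(x = -2/3\) (hence \(y = 2/3\)).
Therefore we can collect together these points, written as \((x,\,y,\,f)\):
\[ A = (0,\,0,\,0), \qquad B = \left(-\frac{2}{3},\,\frac{2}{3},\,\frac{4}{27}\right) \]
To find the nature of these stationary points, we need the Hessian determinant:
\[ f_{xx} = 2, \qquad f_{yy} = 6x + 12y, \qquad f_{xy} = f_{yx} = 6y, \qquad \det(H) = 2(6x + 12y) - 36y^2 \]
and hence at each point:
\[ \det(H(A)) = 0, \qquad \det(H(B)) = -8 < 0 \]
so \(B\) is a saddle point, while the test is inconclusive at \(A\).
To examine what happens at \(A\), we can look at \(f_{x}\) and \(f_{y}\) at the nearby points \((\pm \delta x, \,0)\) and \((0,\,\pm\delta y)\) for \(\delta x, \delta y \ll 1\):
\[ f_x(\pm\delta x,\,0) = \pm 2\,\delta x, \qquad f_y(0,\,\pm\delta y) = 6\,\delta y^2 > 0 \]
In the \(y\) direction \(f_y\) does not change sign, so there is a point of inflection; in the \(x\) direction \(f_x\) changes from negative to positive, so there is a minimum. Hence there is a saddle point at \(A\).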
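The behaviour near the origin can also be seen by evaluating \(f(x,y) = x^2 + 3xy^2 + 2y^3\) itself a small distance away along each axis. The displacement `d` below is an arbitrary small value chosen for illustration.

```python
def f(x, y):
    return x**2 + 3*x*y**2 + 2*y**3

d = 0.01  # small displacement from the origin, where f = 0

right = f(d, 0.0)    # step in +x
left = f(-d, 0.0)    # step in -x
up = f(0.0, d)       # step in +y
down = f(0.0, -d)    # step in -y

print(right, left, up, down)
```

The function increases in both \(x\) directions (a minimum along \(x\)), but increases in \(+y\) and decreases in \(-y\) (a point of inflection along \(y\)), so the origin is neither a local maximum nor a local minimum.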
We can also see this from the contour plot:
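2. Finding the partial derivatives:
\[ f_x = 3x^2 - 3y, \qquad f_y = 3y^2 - 3x \]
To find the points which satisfy \(f_x = f_y = 0\), we have:
\[ x^2 - y = 0, \qquad y^2 - x = 0 \]
Hence \(y = x^2\) from the first equation, which, when plugged into the second, gives:
\[ x^4 - x = x(x^3 - 1) = 0 \]
which means that \(x = 0\) or \(x = 1\) (for real \(x\)), and since \(y = x^2\) we find two stationary points, written as \((x,\,y,\,f)\):
\[ A = (0,\,0,\,-4), \qquad B = (1,\,1,\,-5) \]
To find the nature of these stationary points, we need the Hessian determinant:
\[ f_{xx} = 6x, \qquad f_{yy} = 6y, \qquad f_{xy} = f_{yx} = -3, \qquad \det(H) = 36xy - 9 \]
and hence at each point:
\[ \det(H(A)) = -9 < 0, \qquad \det(H(B)) = 27 > 0 \]
so \(A\) is a saddle point and, given that \(f_{xx}|_B,\, f_{yy}|_B > 0\), \(B\) is a minimum.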
We can also see this from the contour plot: