Definitions

In this section we deal with functions of more than one variable. In the single-variable case, a function $f$ assigns a number $f(x)$ (the output) to each real number $x$ (the input) in the domain of $f$. In the multivariate case, a function $f$ of several variables assigns a number $f\left(x_{1}, x_{2}, \ldots, x_{n}\right)$ (the output) to each tuple of real numbers $\left(x_{1}, x_{2}, \ldots, x_{n}\right)$ (the input) in its domain. A physical example of a function of three variables is pressure, where $p(x, y, z)$ gives the pressure at the point with coordinates $(x, y, z)$.

An important interpretation of the derivative of $f$ in the single-variable case is that of a rate of change: how does $f(x)$ change with $x$? We want to extend this interpretation to functions of several variables. Take the example of a function of two variables, $x$ and $y$, i.e. $f=f(x, y)$. We can ask about the rate at which the function changes at a point with coordinates $(a, b)$ when $x$ varies but $y$ is fixed, or vice versa. These rates are given by the partial derivatives; they give the slopes in the positive $x$ direction and the positive $y$ direction. This contrasts with the directional derivative, which gives the slope in any direction.

Consider a function $f=f(x, y)$ given as follows

f(x, y)=x^{2}+y^{2}

and suppose we want to calculate the rate of change of $f$ at $(a, b)$ as $x$ varies while $y$ is held fixed. Since we are calculating the rate at a particular point $(a, b)$, $y$ is fixed at $b$, which leaves a function of the single variable $x$:

g(x)=f(x, b)=x^{2}+b^{2},

whose derivative is $g^{\prime}(x)=2 x$ (so that $g^{\prime}(a)=2 a$ at the point $x=a$). We refer to $g^{\prime}(x)$ as the partial derivative of $f$ wrt $x$ and denote it by:

\frac{\partial f}{\partial x}, \text { or } f_{x}

Similarly, the partial derivative of $f$ wrt $y$ while holding $x$ fixed is denoted by $\partial f / \partial y$ or $f_{y}$. The forms $\partial f / \partial x$ and $\partial f / \partial y$, written with a 'curly dee', are known as the Leibniz notation; $f_{x}$ and $f_{y}$ are the alternative subscript notation.

Note that sometimes (particularly in thermodynamics) the following notation is used to denote the partial derivatives wrt $x$ and $y$

\left(\frac{\partial f}{\partial x}\right)_{y}, \quad\left(\frac{\partial f}{\partial y}\right)_{x},

where the subscript denotes the variable that is held constant.

We now use the definition of the derivative of a single-variable function to define the partial derivatives of a function of two variables, $f(x, y)$ . The partial derivative of $f(x, y)$ with respect to $x$ while keeping $y$ constant is the function $f_{x}(x, y)$ defined as follows,

f_{x}(x, y)=\lim _{\Delta x \rightarrow 0} \frac{f(x+\Delta x, y)-f(x, y)}{\Delta x},

and the partial derivative of $f(x, y)$ wrt $y$ while keeping $x$ constant is the function $f_{y}(x, y)$ defined as,

f_{y}(x, y)=\lim _{\Delta y \rightarrow 0} \frac{f(x, y+\Delta y)-f(x, y)}{\Delta y}
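The limit definitions can be approximated numerically by taking a small but finite step. The following is a minimal sketch, assuming forward differences with a fixed step of `1e-6`; the helper names `partial_x` and `partial_y` and the evaluation point are illustrative choices, not part of the text.

```python
def f(x, y):
    """Example function from the text: f(x, y) = x^2 + y^2."""
    return x**2 + y**2


def partial_x(func, x, y, dx=1e-6):
    """Forward-difference estimate of f_x: vary x, hold y fixed."""
    return (func(x + dx, y) - func(x, y)) / dx


def partial_y(func, x, y, dy=1e-6):
    """Forward-difference estimate of f_y: vary y, hold x fixed."""
    return (func(x, y + dy) - func(x, y)) / dy


a, b = 1.5, -2.0  # hypothetical evaluation point
fx_estimate = partial_x(f, a, b)
fy_estimate = partial_y(f, a, b)

# Both estimates should be close to 2a and 2b, up to truncation and rounding error.
print(fx_estimate, fy_estimate)
```

A finite step only approximates the limit: making the step too large increases the truncation error, while making it too small amplifies floating-point rounding error.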

Going back to the example, the partial derivative of $f$ wrt $x$ is

f_{x}=2 x ;

which we obtain by differentiating wrt $x$ and treating $y$ as constant. Similarly, $f_{y}$ is obtained by differentiating wrt $y$ and treating $x$ as constant,

f_{y}=2 y \text {. }
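As a check, the same result follows directly from the limit definition of $f_{x}$. For $f(x, y)=x^{2}+y^{2}$ the $y^{2}$ terms cancel in the difference quotient:

f_{x}(x, y)=\lim _{\Delta x \rightarrow 0} \frac{(x+\Delta x)^{2}+y^{2}-\left(x^{2}+y^{2}\right)}{\Delta x}=\lim _{\Delta x \rightarrow 0} \frac{2 x \Delta x+(\Delta x)^{2}}{\Delta x}=\lim _{\Delta x \rightarrow 0}(2 x+\Delta x)=2 x .

The same steps with the roles of $x$ and $y$ exchanged give $f_{y}=2 y$.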

To a function of one variable $f$ one can associate the graph $y=f(x)$, which is a curve in the $xy$-plane. For a function of two variables, we consider the surface given by $z=f(x, y)$, where $z$ is the height of the surface above the $z=0$ plane. For the example function $f(x, y)=x^{2}+y^{2}$, the surface $z=f(x, y)$ is a paraboloid, as shown:

Figure: A graph of the function $z=f(x, y)$, where $f(x, y)=x^{2}+y^{2}$ gives the height of the surface above the $z=0$ plane.

Contour lines are shown on the surface plot; these are the coloured lines that trace circles. The contour lines, or level curves, of the function $z=f(x, y)$ are the curves in the $xy$-plane satisfying $f(x, y)=k$, where $k$ is a constant. Along each contour, therefore, the height of the surface is the same. The contours of the function plotted here are also shown in the $xy$-plane, with each contour labelled with the corresponding value of $k$. Contours are discussed in more detail later.

Figure: The contour lines of the function $f(x, y)=x^{2}+y^{2}$, i.e. the curves satisfying $f(x, y)=k$. Each contour is labelled with its $k$ value, which is the height of the surface along that contour.
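For the example function, the contour $f(x, y)=k$ with $k \geq 0$ is a circle of radius $\sqrt{k}$ centred on the origin. The following minimal sketch samples points on one such circle and confirms that the height is the same at each of them; the helper name `contour_points`, the level `k = 4.0` and the number of sample points are illustrative assumptions.

```python
import math


def f(x, y):
    """Example function from the text: f(x, y) = x^2 + y^2."""
    return x**2 + y**2


def contour_points(k, n=8):
    """Return n points on the level curve f(x, y) = k, a circle of radius sqrt(k)."""
    radius = math.sqrt(k)  # requires k >= 0
    return [
        (radius * math.cos(2 * math.pi * i / n), radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


k = 4.0  # hypothetical contour level
heights = [f(x, y) for x, y in contour_points(k)]

# Every entry should equal k up to rounding error.
print(heights)
```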

Higher derivatives

For a function of two variables, there are two first-order derivatives, $f_{x}$ and $f_{y}$ . As with functions of one variable, we can compute second-order derivatives; these are denoted by,

\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) \text { or } \frac{\partial^{2} f}{\partial x^{2}} \text { or } f_{x x} .

Similarly for second-order derivatives wrt $y$ ,

\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) \text { or } \frac{\partial^{2} f}{\partial y^{2}} \text { or } f_{y y}

We also have mixed partial derivatives denoted by,

\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) \text { or } \frac{\partial^{2} f}{\partial x \partial y} \text { or } f_{y x}

and

\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) \text { or } \frac{\partial^{2} f}{\partial y \partial x} \text { or } f_{x y} .

Assuming that $f_{x y}$ and $f_{y x}$ are continuous, the order in which we take the derivatives does not matter, i.e. $f_{x y}=f_{y x}$. Moreover, the mixed partials are also equal for higher-order derivatives, assuming the continuity condition holds. For example, in the case of third-order partial derivatives the following mixed partials are equal:

f_{x x y}=f_{x y x}=f_{y x x} .

Clairaut's theorem

This continuity condition is formally stated in Clairaut's theorem, which generalises to higher-order partial derivatives provided the corresponding derivatives are continuous. Partial derivatives may then be computed in any order; for example, for a function $f=f(x, y, z)$ we have $f_{x y y z}=f_{y x z y}$ if the fourth-order partial derivatives are continuous.

The theorem states the conditions for equality of mixed partials. If $f_{x y}$ and $f_{y x}$ are continuous functions on a disk $\mathcal{D}$, then $f_{x y}(a, b)=f_{y x}(a, b)$ for all points $(a, b) \in \mathcal{D}$, i.e.

\frac{\partial^{2} f}{\partial x \partial y}=\frac{\partial^{2} f}{\partial y \partial x} .
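The equality of mixed partials can be checked numerically on a simple case. The sketch below assumes a hypothetical smooth function $f(x, y)=x^{2} y^{3}$ (not from the text), whose first partials $f_{x}=2 x y^{3}$ and $f_{y}=3 x^{2} y^{2}$ are coded by hand; each is then differentiated with respect to the other variable by a central difference. The helper name `central_diff`, the step size and the evaluation point are illustrative choices.

```python
def f_x(x, y):
    """Exact partial derivative wrt x of the hypothetical f(x, y) = x^2 * y^3."""
    return 2 * x * y**3


def f_y(x, y):
    """Exact partial derivative wrt y of the same f."""
    return 3 * x**2 * y**2


def central_diff(g, t, h=1e-5):
    """Central-difference estimate of the derivative of a one-variable function g at t."""
    return (g(t + h) - g(t - h)) / (2 * h)


a, b = 1.2, 0.7  # hypothetical evaluation point

# f_xy: differentiate f_x wrt y, holding x = a fixed.
f_xy = central_diff(lambda y: f_x(a, y), b)
# f_yx: differentiate f_y wrt x, holding y = b fixed.
f_yx = central_diff(lambda x: f_y(x, b), a)

# Both estimates should be close to the closed form 6*a*b**2.
print(f_xy, f_yx, 6 * a * b**2)
```

The two estimates come from different computations, so their agreement is a genuine (if approximate) check and not a consequence of how the code is written.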

Gradient and Hessian

Given a function of two variables, $f(x, y)$, we define the gradient of $f$, denoted by $\nabla f$, to be the vector of partial derivatives of $f$:

\nabla f=\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)

The Hessian matrix, denoted by $\mathcal{H} f$ , is a square matrix of second-order partial derivatives of a twice-differentiable function, $f$ :

\mathcal{H} f=\left(\begin{array}{cc} \frac{\partial^{2} f}{\partial x^{2}} & \frac{\partial^{2} f}{\partial x \partial y} \\ \frac{\partial^{2} f}{\partial y \partial x} & \frac{\partial^{2} f}{\partial y^{2}} \end{array}\right)

Provided the second-order derivatives are continuous, they are independent of the order in which the derivatives are taken (see Clairaut's theorem in Subsec. 1.2.2), so the Hessian matrix is symmetric. This generalises readily to functions of more than two variables: for instance, the Hessian matrix of a function $f=f(x, y, z)$ is a $3 \times 3$ matrix whose rows are given by $\left[f_{x x}, f_{x y}, f_{x z}\right]$, $\left[f_{y x}, f_{y y}, f_{y z}\right]$ and $\left[f_{z x}, f_{z y}, f_{z z}\right]$. We will make use of these definitions in the next section on Taylor expansion.
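To make the two definitions concrete, the following minimal sketch estimates the gradient and the Hessian of the example function $f(x, y)=x^{2}+y^{2}$ by central differences. The function names `gradient` and `hessian`, the step sizes and the evaluation point are illustrative assumptions. For this $f$ the exact values are $\nabla f=(2 x, 2 y)$ and a Hessian with $2$ on the diagonal and $0$ off it.

```python
def f(x, y):
    """Example function from the text: f(x, y) = x^2 + y^2."""
    return x**2 + y**2


def gradient(func, x, y, h=1e-5):
    """Central-difference estimate of the gradient (f_x, f_y)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (fx, fy)


def hessian(func, x, y, h=1e-4):
    """Central-difference estimate of the 2x2 Hessian [[f_xx, f_xy], [f_yx, f_yy]]."""
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    # One symmetric stencil fills both mixed entries; this assumes the
    # mixed partials are equal, which holds when they are continuous.
    return [[fxx, fxy], [fxy, fyy]]


grad = gradient(f, 1.0, 2.0)  # hypothetical evaluation point (1, 2)
hess = hessian(f, 1.0, 2.0)
print(grad)
print(hess)
```

The returned matrix is symmetric by construction, so this sketch illustrates the layout of the Hessian but does not test Clairaut's theorem.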
