Differentiation rules

Product rule

The product rule generalises to multivariable functions without modification. So, for a function of two variables that is a product, $f(x, y)=u(x, y)\, v(x, y)$, we have:

$$f_{x}=u_{x} v+u v_{x}, \quad f_{y}=u_{y} v+u v_{y}$$

where the subscript denotes partial differentiation.
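As a worked illustration of the product rule above, take $u=x^{2} y$ and $v=e^{x y}$. This particular choice of $u$ and $v$ is an assumption made only for the example; it does not come from the original notes.

```latex
\[
\begin{aligned}
f(x, y) &= x^{2} y \, e^{x y}, \qquad u = x^{2} y, \quad v = e^{x y}, \\
f_{x} &= u_{x} v + u v_{x} = 2 x y \, e^{x y} + x^{2} y \cdot y \, e^{x y} = \left(2 x y + x^{2} y^{2}\right) e^{x y}, \\
f_{y} &= u_{y} v + u v_{y} = x^{2} e^{x y} + x^{2} y \cdot x \, e^{x y} = \left(x^{2} + x^{3} y\right) e^{x y} .
\end{aligned}
\]
```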

Chain rule

We can also use the chain rule, but we need to distinguish between two cases, which depend on how many variables we are dealing with. Let us first recall that in the single-variable case, for a function $y=f(u)$ where $u=g(x)$, the chain rule gives

$$\frac{d y}{d x}=\frac{d y}{d u} \frac{d u}{d x} .$$

We outline the two different cases for a function of two variables below.

Case 1

Consider a function $h(x, y)=f(u(x, y))$ where $f$ is a function of a single variable, i.e. $f=f(u)$, and $u$ is a function of two variables, $u=u(x, y)$. This is analogous to the single-variable case shown above, but now we have two equations for the two partial derivatives of $h$ with respect to $x$ and $y$. Applying the chain rule to determine $h_{x}$, we have

$$h_{x}=\frac{d f}{d u} \frac{\partial u}{\partial x}$$

and similarly $h_{y}$ is

$$h_{y}=\frac{d f}{d u} \frac{\partial u}{\partial y} .$$
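A short worked instance of Case 1 may help. The functions $f(u)=\sin u$ and $u(x, y)=x^{2} y$ below are assumed purely for illustration:

```latex
\[
\begin{aligned}
h(x, y) &= \sin \left(x^{2} y\right), \qquad f(u) = \sin u, \quad u = x^{2} y, \\
h_{x} &= \frac{d f}{d u} \frac{\partial u}{\partial x} = \cos \left(x^{2} y\right) \cdot 2 x y, \\
h_{y} &= \frac{d f}{d u} \frac{\partial u}{\partial y} = \cos \left(x^{2} y\right) \cdot x^{2} .
\end{aligned}
\]
```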

Case 2

Consider now a function $h(x, y)=f(u, v)$ where $u=u(x, y)$ and $v=v(x, y)$. Since $h$ is a function of $x$ and $y$, the partial derivatives we want to compute are $h_{x}$ and $h_{y}$, as above. The rate at which $h$ changes with $x$ depends on both $u$ and $v$. The partial derivative of $h$ with respect to $x$ is therefore given by the chain rule as follows

$$h_{x}=\frac{\partial f}{\partial u} \frac{\partial u}{\partial x}+\frac{\partial f}{\partial v} \frac{\partial v}{\partial x}$$

and $h_{y}$ is

$$h_{y}=\frac{\partial f}{\partial u} \frac{\partial u}{\partial y}+\frac{\partial f}{\partial v} \frac{\partial v}{\partial y} .$$

Note that when differentiating $f$ partially with respect to $u$, we treat $v$ as a constant, and vice versa.
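The Case 2 formulas can be sanity-checked numerically. The sketch below is only an illustration under assumed functions: $u=xy$, $v=x+y^{2}$ and $f(u, v)=u^{2} v$ are hypothetical choices, and the helper names are ours. It compares the chain-rule expressions with central finite differences of $h$.

```python
def u(x, y):
    return x * y            # assumed example: u = x*y

def v(x, y):
    return x + y ** 2       # assumed example: v = x + y^2

def f(u_val, v_val):
    return u_val ** 2 * v_val   # assumed example: f(u, v) = u^2 * v

def h(x, y):
    return f(u(x, y), v(x, y))

def h_x_chain(x, y):
    """h_x = f_u * u_x + f_v * v_x, each factor differentiated by hand."""
    uu, vv = u(x, y), v(x, y)
    f_u = 2 * uu * vv       # v treated as a constant
    f_v = uu ** 2           # u treated as a constant
    return f_u * y + f_v * 1.0          # u_x = y, v_x = 1

def h_y_chain(x, y):
    """h_y = f_u * u_y + f_v * v_y."""
    uu, vv = u(x, y), v(x, y)
    f_u = 2 * uu * vv
    f_v = uu ** 2
    return f_u * x + f_v * 2 * y        # u_y = x, v_y = 2y

def central_diff(g, x, y, wrt, step=1e-6):
    """Central finite difference of g with respect to 'x' or 'y'."""
    if wrt == "x":
        return (g(x + step, y) - g(x - step, y)) / (2 * step)
    return (g(x, y + step) - g(x, y - step)) / (2 * step)

x0, y0 = 1.3, -0.4
print(h_x_chain(x0, y0), central_diff(h, x0, y0, "x"))
print(h_y_chain(x0, y0), central_diff(h, x0, y0, "y"))
```

Each printed pair should agree to several decimal places; the finite difference is only an approximation, so exact equality is not expected.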

The chain rule is used to compute derivatives when changing variables. Suppose we have some quantity $f$ expressed in polar coordinates so that $f(r, \theta)$ is known for any $r$ and $\theta$. Suppose further that we want to differentiate with respect to the Cartesian coordinates $x$ and $y$ [this often arises in the theory of partial differential equations (PDEs)]. Since $r=r(x, y)$ and $\theta=\theta(x, y)$, we can express $f(r, \theta)$ in terms of $x, y$ as follows

$$h(x, y)=f(r(x, y), \theta(x, y)),$$

and apply the chain rule:

$$h_{x}=\frac{\partial f}{\partial r} \frac{\partial r}{\partial x}+\frac{\partial f}{\partial \theta} \frac{\partial \theta}{\partial x}$$

and for $h_{y}$

$$h_{y}=\frac{\partial f}{\partial r} \frac{\partial r}{\partial y}+\frac{\partial f}{\partial \theta} \frac{\partial \theta}{\partial y} .$$

In order to obtain $h_{x}$, we need to determine $\partial r / \partial x$ and $\partial \theta / \partial x$ (since the form of $f(r, \theta)$ is known, we can easily obtain $\partial f / \partial r$ and $\partial f / \partial \theta$). Using $x=r \cos \theta$ and $y=r \sin \theta$, we can express $r$ in terms of $x$ and $y$,

$$r=\sqrt{x^{2}+y^{2}} .$$

Further, since $\tan \theta=y / x$, taking the inverse yields

$$\theta=\tan ^{-1}\left(\frac{y}{x}\right)+n \pi$$

where $n$ is an integer.

Laplacian operator

A function that is not one-to-one (1-1) cannot have an inverse unless its domain is restricted. Recall that periodic functions are not 1-1. To define the single-argument inverse tangent $\tan ^{-1} x$, we restrict the domain of $\tan$ to $(-\pi / 2, \pi / 2)$; this is an open interval, i.e. the endpoints are not included:

Figure: The graph of $y=\tan x$ in $-2 \pi \leq x \leq 2 \pi$. The red part of the graph is in $-\pi / 2<x<\pi / 2$.

Figure: The inverse of the red part of the graph, given by $y=\tan ^{-1} x$; its range is $-\pi / 2<y<\pi / 2$.

Now, let us go back to the two-argument arctan function, $\theta=\tan ^{-1}(y / x)$. We know that $\theta$ is defined on the full circle, $-\pi<\theta \leq \pi$. However, $\theta$ is calculated from $\tan ^{-1}(y / x)$, and we know that $-\pi / 2<\tan ^{-1}(y / x)<\pi / 2$. We need to adjust the value calculated from $\tan ^{-1}(y / x)$ so that it is consistent with the signs of the arguments $y$ and $x$ that were used to calculate it. To see this, consider the unit circle $(r=1)$ shown below, with the four quadrants of a Cartesian coordinate system.

Figure: The unit circle $(r=1)$.

Let $\theta^{*}$ denote the angle that is consistent with the signs of the arguments $x$ and $y$. For $x>0, y>0$, we are in the first quadrant, where $y / x>0$ and $0<\theta^{*}<\pi / 2$. Indeed, from the graph of $y=\tan ^{-1} x$, a positive argument $y / x>0$ gives a value of $\tan ^{-1}(y / x)$ in $(0, \pi / 2)$. It follows that our equation:

$$\theta=\tan ^{-1}\left(\frac{y}{x}\right)+n \pi$$

yields $\theta=\theta^{*}$ if we set $n=0$. This is also true when $x>0, y<0$. Moreover, we can show that for $x<0, y>0$ we take $n=1$, and for $x<0, y<0$ we take $n=-1$.

However, since we are looking for the partial derivatives of $\theta$ with respect to $x$ and $y$, the constant $n \pi$ vanishes upon differentiation and does not enter the final result.
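The quadrant rule for $n$ described above can be checked against Python's standard two-argument arctangent, `math.atan2(y, x)`. This is a minimal sketch: the function name `theta_from_arctan` is ours, and the treatment of the boundary case $y=0$, $x<0$ (assigned $\theta=\pi$) is an assumption, since the text only covers the open quadrants.

```python
import math

def theta_from_arctan(x, y):
    """Return theta = atan(y/x) + n*pi, with n chosen from the signs of x and y."""
    if x == 0:
        raise ValueError("y/x is undefined on the line x = 0")
    if x > 0:
        n = 0            # first and fourth quadrants
    elif y >= 0:
        n = 1            # second quadrant (assumption: y = 0 is assigned theta = pi)
    else:
        n = -1           # third quadrant
    return math.atan(y / x) + n * math.pi

# One sample point per quadrant
points = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
for x, y in points:
    print((x, y), theta_from_arctan(x, y), math.atan2(y, x))
```

For each sample point the two printed angles should coincide up to rounding, which is why numerical code usually calls `atan2` directly rather than adjusting `atan(y / x)` by hand.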

From our equation expressing $r$ in terms of $x$ and $y$,

$$r=\sqrt{x^{2}+y^{2}},$$

differentiating partially with respect to $x$ gives:

$$\frac{\partial r}{\partial x}=\frac{x}{\sqrt{x^{2}+y^{2}}}$$

and from the equation for $\theta$, we have:

$$\frac{\partial \theta}{\partial x}=-\frac{y}{x^{2}+y^{2}}$$

where we have used the chain rule with $z=y / x$, so that $\partial z / \partial x=-y / x^{2}$, together with the result

$$\frac{d}{d z} \tan ^{-1} z=\frac{1}{1+z^{2}} .$$

To see this, let $f(z)=\tan ^{-1} z$ such that

$$\tan f(z)=z .$$

Differentiating implicitly gives

$$\sec ^{2} f(z) \frac{d f}{d z}=1 \Rightarrow \frac{d f}{d z}=\frac{1}{\sec ^{2} f(z)} .$$

Using the identity $\sec ^{2} f(z) \equiv \tan ^{2} f(z)+1$,

$$\frac{d f}{d z}=\frac{1}{\tan ^{2} f(z)+1}$$

which is

$$\frac{d f}{d z}=\frac{1}{z^{2}+1}$$

since $\tan ^{2} f(z)=z^{2}$. Note that as $z \rightarrow \pm \infty$, $d f / d z \rightarrow 0$.

In polar coordinates, using $x=r \cos \theta$ and $y=r \sin \theta$, the derivatives $\partial r / \partial x$ and $\partial \theta / \partial x$ are given by:

$$\frac{\partial r}{\partial x}=\cos \theta, \quad \frac{\partial \theta}{\partial x}=-\frac{\sin \theta}{r}$$
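These two derivatives can be checked numerically. The sketch below is an illustration only: the sample point and the helper names are assumptions, and `math.atan2` is used for $\theta$ so that the quadrant is handled automatically. It compares central finite differences of $r(x, y)$ and $\theta(x, y)$ with $\cos \theta$ and $-\sin \theta / r$.

```python
import math

def r_of(x, y):
    return math.hypot(x, y)          # r = sqrt(x^2 + y^2)

def theta_of(x, y):
    return math.atan2(y, x)          # quadrant-aware angle

def d_dx(g, x, y, step=1e-6):
    """Central finite difference of g(x, y) with respect to x."""
    return (g(x + step, y) - g(x - step, y)) / (2 * step)

# Assumed sample point, away from the origin and the negative x-axis
x0, y0 = 1.2, -0.7
r0, theta0 = r_of(x0, y0), theta_of(x0, y0)

r_x_numeric = d_dx(r_of, x0, y0)
theta_x_numeric = d_dx(theta_of, x0, y0)

r_x_formula = math.cos(theta0)               # equals x / sqrt(x^2 + y^2)
theta_x_formula = -math.sin(theta0) / r0     # equals -y / (x^2 + y^2)

print(r_x_numeric, r_x_formula)
print(theta_x_numeric, theta_x_formula)
```

The sample point is kept away from the negative $x$-axis because `atan2` jumps by $2 \pi$ there, which would spoil the finite difference even though the derivative formula itself remains valid.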

Therefore, going back to the chain rule expression for $h_{x}$,

$$h_{x}=\frac{\partial f}{\partial r} \frac{\partial r}{\partial x}+\frac{\partial f}{\partial \theta} \frac{\partial \theta}{\partial x},$$

we obtain:

$$h_{x}=\frac{\partial f}{\partial r} \cos \theta-\frac{\partial f}{\partial \theta} \frac{\sin \theta}{r} .$$

The result can be written in the operator form

$$\frac{\partial}{\partial x}=\cos \theta \frac{\partial}{\partial r}-\frac{\sin \theta}{r} \frac{\partial}{\partial \theta} .$$

A similar formula can be obtained for $\partial / \partial y$ using expressions for $\partial r / \partial y$ and $\partial \theta / \partial y$. This is given by

$$\frac{\partial}{\partial y}=\sin \theta \frac{\partial}{\partial r}+\frac{\cos \theta}{r} \frac{\partial}{\partial \theta}$$
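The expressions for $\partial r / \partial y$ and $\partial \theta / \partial y$ are not written out in the text. Following the same steps used for the $x$-derivatives, a short derivation might read as follows (a sketch, using $r=\sqrt{x^{2}+y^{2}}$ and $\theta=\tan ^{-1}(y / x)+n \pi$):

```latex
\[
\begin{aligned}
\frac{\partial r}{\partial y} &= \frac{y}{\sqrt{x^{2}+y^{2}}} = \frac{r \sin \theta}{r} = \sin \theta, \\
\frac{\partial \theta}{\partial y} &= \frac{1}{1+(y / x)^{2}} \cdot \frac{1}{x} = \frac{x}{x^{2}+y^{2}} = \frac{r \cos \theta}{r^{2}} = \frac{\cos \theta}{r}, \\
h_{y} &= \frac{\partial f}{\partial r} \frac{\partial r}{\partial y}+\frac{\partial f}{\partial \theta} \frac{\partial \theta}{\partial y} = \frac{\partial f}{\partial r} \sin \theta+\frac{\partial f}{\partial \theta} \frac{\cos \theta}{r} .
\end{aligned}
\]
```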

To obtain the second partial derivative operators, $\partial^{2} / \partial x^{2}$ and $\partial^{2} / \partial y^{2}$, we differentiate partially once more with respect to $x$ and $y$, respectively. We can then show that their sum simplifies to

$$\frac{\partial^{2}}{\partial x^{2}}+\frac{\partial^{2}}{\partial y^{2}}=\frac{\partial^{2}}{\partial r^{2}}+\frac{1}{r} \frac{\partial}{\partial r}+\frac{1}{r^{2}} \frac{\partial^{2}}{\partial \theta^{2}}$$

This is known as the Laplacian operator and it is extremely useful in the theory of PDEs.
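As a quick consistency check of the polar form, consider a test function chosen here only for illustration, $f=r^{2} \cos 2 \theta$, which equals $x^{2}-y^{2}$ in Cartesian coordinates. Both forms of the operator should give the same answer:

```latex
\[
\begin{aligned}
f &= r^{2} \cos 2 \theta = r^{2}\left(\cos ^{2} \theta-\sin ^{2} \theta\right) = x^{2}-y^{2}, \\
\frac{\partial^{2} f}{\partial x^{2}}+\frac{\partial^{2} f}{\partial y^{2}} &= 2-2 = 0, \\
\frac{\partial^{2} f}{\partial r^{2}}+\frac{1}{r} \frac{\partial f}{\partial r}+\frac{1}{r^{2}} \frac{\partial^{2} f}{\partial \theta^{2}} &= 2 \cos 2 \theta+2 \cos 2 \theta-4 \cos 2 \theta = 0 .
\end{aligned}
\]
```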

Implicit differentiation

Suppose we have $z=F(x, y)$ with domain $D$, with $y=y(x)$. We put the function in the form:

$$F(x, y)=0 ; \tag{1.30}$$

if the RHS is not zero, we move everything to the left to get it in this form. Now, suppose we want to compute the derivative $\frac{d y}{d x}$. Sometimes we can proceed by solving Eq. (1.30) for $y$, but that is not always possible. Assume we have a point $\left(x_{0}, y_{0}\right) \in D$ such that $F\left(x_{0}, y_{0}\right)=0$ (so that $y\left(x_{0}\right)=y_{0}$). Now, along the curve

$$F(x, y(x))=0, \tag{1.31}$$

$F$ is a function of $x$ only. We can differentiate with respect to $x$ on both sides:

$$\begin{aligned} \frac{d}{d x}[F(x, y(x))] & =0 \\ \frac{\partial F}{\partial x}(x, y(x)) \frac{d x}{d x}+\frac{\partial F}{\partial y}(x, y(x)) \frac{d y}{d x} & =0 . \end{aligned} \tag{1.32}$$

Using $\frac{d x}{d x}=1$, Eq. (1.32) gives $\frac{d y}{d x}$ as:

$$\frac{d y}{d x}=-\frac{F_{x}(x, y(x))}{F_{y}(x, y(x))} \tag{1.33}$$

provided that $F_{y} \neq 0$. Recall that the subscripts in $F_{x}$ and $F_{y}$ denote partial differentiation with respect to $x$ and $y$, respectively. At the point $\left(x_{0}, y_{0}\right)$, we have:

$$\frac{d y}{d x}\left(x_{0}\right)=-\frac{F_{x}\left(x_{0}, y_{0}\right)}{F_{y}\left(x_{0}, y_{0}\right)} . \tag{1.34}$$

Equation (1.34) gives the slope of the contour line at the point where we started, i.e. in this case $\left(x_{0}, y_{0}\right)$. Of course, we can vary the point $\left(x_{0}, y_{0}\right)$ in the domain $D$.
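A simple case where the result can be compared with an explicit solution is the unit circle. This example is an assumption added for illustration and is not part of the original notes:

```latex
\[
\begin{aligned}
F(x, y) &= x^{2}+y^{2}-1 = 0, \\
F_{x} &= 2 x, \qquad F_{y} = 2 y, \\
\frac{d y}{d x} &= -\frac{F_{x}}{F_{y}} = -\frac{x}{y} \quad (y \neq 0), \\
\left.\frac{d y}{d x}\right|_{(1 / \sqrt{2},\, 1 / \sqrt{2})} &= -1 .
\end{aligned}
\]
```

On the upper half of the circle, solving explicitly gives $y=\sqrt{1-x^{2}}$ and $\frac{d y}{d x}=-x / \sqrt{1-x^{2}}=-x / y$, in agreement with the implicit result.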

Example 1.11 Let $x y=\ln (2 x+y)$ define a curve in the $x y$-plane. Compute the derivative $\frac{d y}{d x}$ at the point $\left(x_{0}, y_{0}\right)=(0,1)$.

Solution First note that we cannot 'solve for $y$' in this case since $y$ appears inside the $\ln$ function. We proceed by defining:

$$F(x, y)=x y-\ln (2 x+y)=0, \quad D: 2 x+y>0 . \tag{1.35}$$

Note that the domain $D$ is defined by $2 x+y>0$ since the natural logarithm takes only positive real numbers as its argument. We use Eq. (1.33) to compute $\frac{d y}{d x}$:

$$\frac{d y}{d x}=-\frac{F_{x}(x, y)}{F_{y}(x, y)}=-\frac{y-\frac{2}{2 x+y}}{x-\frac{1}{2 x+y}}=-\frac{y(2 x+y)-2}{x(2 x+y)-1} . \tag{1.36}$$
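The slope formula for this example can be sanity-checked numerically at the point $(0,1)$. The sketch below is one possible check, not part of the original solution: it solves $F(x, y)=0$ for $y$ at two nearby values of $x$ using Newton's method (a technique swapped in here for the check; the helper names, iteration count and step size are all assumptions) and compares the resulting finite-difference slope with $-F_{x} / F_{y}$.

```python
import math

def F(x, y):
    return x * y - math.log(2 * x + y)

def F_x(x, y):
    return y - 2 / (2 * x + y)

def F_y(x, y):
    return x - 1 / (2 * x + y)

def slope_formula(x, y):
    """dy/dx = -F_x / F_y from implicit differentiation."""
    return -F_x(x, y) / F_y(x, y)

def solve_y(x, y_guess, iterations=50):
    """Newton's method in y for F(x, y) = 0 at fixed x (assumes F_y != 0 nearby)."""
    y = y_guess
    for _ in range(iterations):
        y -= F(x, y) / F_y(x, y)
    return y

x0, y0 = 0.0, 1.0
step = 1e-4

# Follow the curve a small distance either side of x0 and take a central difference
slope_numeric = (solve_y(x0 + step, y0) - solve_y(x0 - step, y0)) / (2 * step)

print(slope_formula(x0, y0), slope_numeric)
```

The two printed values should agree closely; the numerical slope carries a small finite-difference error, so only approximate agreement is expected.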

At $(0,1)$, Eq. (1.36) gives $\frac{d y}{d x}=-1$.

The same reasoning can be extended to functions of more than two variables, e.g. $w=F(x, y, z)$. Now, $F(x, y, z)=k$, with $k$ a constant, are level surfaces, which can be represented as the graph of a function, $z=f(x, y)$. Again, sometimes we can 'solve for $z$' and sometimes we cannot. In any case, we can compute the partial derivatives. So, with $z=f(x, y)$, the level surface is:

$$F(x, y, f(x, y))=k . \tag{1.37}$$

Then, differentiating both sides with respect to $x$ and $y$ (the right-hand side is constant, so its derivatives vanish), we have:

$$\begin{aligned} \frac{\partial}{\partial x}[F(x, y, f(x, y))] & =\frac{\partial F}{\partial x}+\frac{\partial F}{\partial z} \frac{\partial f}{\partial x}=0 \\ \frac{\partial}{\partial y}[F(x, y, f(x, y))] & =\frac{\partial F}{\partial y}+\frac{\partial F}{\partial z} \frac{\partial f}{\partial y}=0 . \end{aligned} \tag{1.38}$$

Now, from Eqs. (1.38), we can obtain the partial derivatives of the implicit function $f(x, y)$ which defines the level surface of $F$ through $(x, y, z)$. Solving for $f_{x}$ and $f_{y}$ from Eqs. (1.38) yields, provided that $F_{z} \neq 0$,

$$f_{x}=-\frac{F_{x}(x, y, z)}{F_{z}(x, y, z)}, \quad f_{y}=-\frac{F_{y}(x, y, z)}{F_{z}(x, y, z)} .$$
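To illustrate these formulas, consider a sphere as the level surface. The choice $F(x, y, z)=x^{2}+y^{2}+z^{2}$ with $k>0$ is an assumption made for this example only:

```latex
\[
\begin{aligned}
F(x, y, z) &= x^{2}+y^{2}+z^{2} = k, \\
F_{x} &= 2 x, \qquad F_{y} = 2 y, \qquad F_{z} = 2 z, \\
f_{x} &= -\frac{F_{x}}{F_{z}} = -\frac{x}{z}, \qquad f_{y} = -\frac{F_{y}}{F_{z}} = -\frac{y}{z} \quad (z \neq 0) .
\end{aligned}
\]
```

On the upper hemisphere we can solve explicitly, $z=f(x, y)=\sqrt{k-x^{2}-y^{2}}$, which gives $f_{x}=-x / \sqrt{k-x^{2}-y^{2}}=-x / z$, consistent with the implicit result.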