Artificial Intelligence 🤖
Differential equations
First order PDEs

First order PDEs

In this section, we describe a general technique for solving first-order equations. We start our discussion with solutions to the transport equation to demonstrate the geometric concept behind PDE solution methods. We then move on to semilinear, first-order PDEs to introduce the idea behind the method of characteristics. These ideas can then be extended to the more complicated PDEs given by quasilinear and fully nonlinear cases. Before we continue however, we expand on the classification of the PDEs discussed in Sec. 2.1 giving a broader definition of the various classes of PDEs we can encounter.

  1. A PDE of order nn is quasilinear if it is linear in the derivatives of order nn with coefficients that depend on the independent variables and derivatives of the unknown function of order strictly less than nn.
  2. A quasilinear PDE where the coefficients of derivatives of order nn are functions of the independent variables alone is called semilinear.
  3. A PDE which is linear in the unknown function and all its derivatives with coefficients depending on the independent variables alone is called linear.
  4. A PDE which is not quasilinear is fully nonlinear.

The transport equation

The simplest transport equation takes the form,

ut+cux=0,u_{t}+c u_{x}=0,

where cc is a constant and u=u(x,t)u=u(x, t). We seek functions u(x,t)u(x, t) that satisfy the transport equation. We approach this by looking a solution method via the directional derivative. The directional derivative of uu in the direction of a vector v\boldsymbol{v} is given by:

v^u\hat{\boldsymbol{v}} \cdot \nabla u

where v^\hat{\boldsymbol{v}} is the unit vector in the direction of v\boldsymbol{v}. In particular, if the directional derivative in Eq. (2.19) is 0 at all points then uu is constant along all lines that are tangential to v\boldsymbol{v}. This can be thought of as a generalisation of a partial derivative. Now, by rewriting the transport equation as a dot product, we have

ut+cux=(c,1)(ux,ut)=0.u_{t}+c u_{x}=(c, 1) \cdot\left(u_{x}, u_{t}\right)=0 .

Notice that the RHS of Eq. (2.20) is almost the directional derivative of uu in the direction of v\boldsymbol{v}. To make it look exactly like the directional derivative, we can simply divide by c2+1\sqrt{c^{2}+1}, i.e.:

1c2+1(c,1)(ux,ut)=0\frac{1}{\sqrt{c^{2}+1}}(c, 1) \cdot\left(u_{x}, u_{t}\right)=0

which is equivalent to Eq. (2.19). Hence, every solution uu must be constant along lines that are tangential to (c,1)(c, 1). In this case, the lines are also parallel to the vector (c,1)(c, 1) (see Fig. 2.1). These lines, since they are parallel to (c,1)(c, 1), they have slope,

dxdt=1c.\frac{d x}{d t}=\frac{1}{c} .

Upon integrating, we obtain:

x=ctx0,x=c t-x_{0},

where x0x_{0} is a constant. Rearranging to give

xct=x0,x-c t=x_{0},

we can see that the equation describes infinitely many lines which are tangential and parallel to the vector (c,1)(c, 1). Along each of these lines uu is constant and, so, uu depends only on xcx-c. The general solution to the PDE therefore takes the form,

u(x,t)=f(xct),u(x, t)=f(x-c t),

where ff is some arbitrary, differentiable function.

Method of characteristics

Consider the PDE:

a(x,y)ux+b(x,y)ux=f(x,y,u)a(x, y) u_{x}+b(x, y) u_{x}=f(x, y, u)

where u=u(x,y)u=u(x, y). Equation (2.24) is semilinear and it includes the linear form,

a(x,y)ux+b(x,y)ux+c(x,y)u=g(x,y),a(x, y) u_{x}+b(x, y) u_{x}+c(x, y) u=g(x, y),

Figure 2.1: Characteristic curves; the PDE solution is constant along each curve in (x,t)(x, t) space and each curve is equal to a different constant value, x0x_{0}. The initial data is propagated along these characteristic curves.

with f(x,y,u)=g(x,y)c(x,y)uf(x, y, u)=g(x, y)-c(x, y) u. We also consider the initial condition given by:

u(x,0)=u0(x).u(x, 0)=u_{0}(x) .

Suppose we have found a solution u(x,y)u(x, y) to Eq. (2.24). Plotting the relationship of the solution as it varies with the independent variables xx and yy, gives a surface, z=u(x,y)z=u(x, y). Let us denote this surface by S=u(x,y)zS=u(x, y)-z, or equivalently,

S(x,y,u):=u(x,y)u=0,S(x, y, u):=u(x, y)-u=0,

for all real variables xx and yy. A surface u=u(x,y)u=u(x, y) is a solution to the PDE and is known as the integral surface. In Eq. (2.27), u(x,y)u(x, y) is a function of xx and yy where uu is a variable. For instance, if u(x,y)=xyu(x, y)=x-y then S=xyuS=x-y-u. From vector calculus, we know that the gradient, S\nabla S gives a normal to the surface S=0S=0, i.e.:

N=S=(ux,uy,1).\mathbf{N}=\nabla S=\left(u_{x}, u_{y},-1\right) .

This is a downward-pointing vector (because of the -1 term in the zz-direction), as shown in Fig. 2.2. Note that the PDE (2.24) may be rewritten in the following dot product form:

(a(x,y),b(x,y),f(x,y,u))(ux,uy,1)=0.(a(x, y), b(x, y), f(x, y, u)) \cdot\left(u_{x}, u_{y},-1\right)=0 .

Equation (2.29) is the scalar product of two vectors; a zero dot product indicates that the two vectors are at right angles. We can write Eq.

(2.29) as:

(a(x,y),b(x,y),f(x,y,u))N=0(a(x, y), b(x, y), f(x, y, u)) \cdot \mathbf{N}=0

The vector,

v=(a(x,y),b(x,y),f(x,y,u))\boldsymbol{v}=(a(x, y), b(x, y), f(x, y, u))

is normal to N\boldsymbol{N}. Now, since N\boldsymbol{N} is normal to the surface S=0S=0, this means that v\boldsymbol{v} is tangential to the surface S=0S=0 which, in turn, implies that v\boldsymbol{v} lies in the tangent plane to SS (see Fig. 2.2).

Figure 2.2: The surface SS showing the normal to the surface, N(x,y)N(x, y) at a point (x,y,u(x,y))(x, y, u(x, y)) and the vector v\boldsymbol{v} [given by Eq. (2.31)] which lies in the tangent plane to SS.

As a result, the PDE dictates that any integral surface through a given point must be tangent to v\boldsymbol{v}. This means that we start from an initial condition (which lies in the integral surface) and we move in the direction of v\boldsymbol{v}; since v\boldsymbol{v} lies in the tangent plane, we move along a curve which lies within S=0S=0. This curve is referred to as the characteristic curve. The question is how do we go about constructing such a surface? We start by describing points on a characteristic curve through the parametrisation,

r(t)=(x(t),y(t),u(t)).\boldsymbol{r}(t)=(x(t), y(t), u(t)) .

Differentiating r\boldsymbol{r} with respect to tt gives the tangent vector

r(t)=(x(t),y(t),u(t))\boldsymbol{r}^{\prime}(t)=\left(x^{\prime}(t), y^{\prime}(t), u^{\prime}(t)\right)

where the primes denote differentiation with respect to tt. This tangent vector, r(t)\boldsymbol{r}^{\prime}(t) then, belongs to the tangent plane to the surface S=0S=0 at a given point. Recall that the characteristic curve moves in the direction of v\boldsymbol{v} and as such, v\boldsymbol{v} and r\boldsymbol{r}^{\prime} must be proportional. Therefore, we have

r=λv\boldsymbol{r}^{\prime}=\lambda \boldsymbol{v}

for some λ\lambda. Using Eqs. (2.31), (2.33) and (2.34), we can write down the following:

dx/dta(x,y)=dy/dtb(x,y)=du/dtf(x,y,u).\frac{d x / d t}{a(x, y)}=\frac{d y / d t}{b(x, y)}=\frac{d u / d t}{f(x, y, u)} .

Or, in differential form:

dxa(x,y)=dyb(x,y)=duf(x,y,u).\frac{d x}{a(x, y)}=\frac{d y}{b(x, y)}=\frac{d u}{f(x, y, u)} .

From Eq. (2.36), we can form two pairs of ODEs:

dydx=b(x,y)a(x,y),dudx=f(x,y,u)a(x,y)\frac{d y}{d x}=\frac{b(x, y)}{a(x, y)}, \frac{d u}{d x}=\frac{f(x, y, u)}{a(x, y)}

or

dxdy=a(x,y)b(x,y),dudy=f(x,y,u)b(x,y).\frac{d x}{d y}=\frac{a(x, y)}{b(x, y)}, \frac{d u}{d y}=\frac{f(x, y, u)}{b(x, y)} .

The solutions to Eq. (2.37) [or Eq. (2.38)] determine the characteristics of the PDE. Example 2.1 Find a solution to the following semilinear PDE,

xux+yuy=xeu,x>0.x u_{x}+y u_{y}=x e^{-u}, x>0 .

Solution We start off by writing the ODEs as in Eq. (2.36):

dxx=dyy=duxeu\frac{d x}{x}=\frac{d y}{y}=\frac{d u}{x e^{-u}}

The objective here is to find a pair of differential equations which are easy to solve with basic ODE techniques. As mentioned previously, we have the two options given by Eqs. (2.37) and (2.38). We see that if we choose the first pair [Eq. (2.37)], we have:

dxx=dyydydx=yx,\frac{d x}{x}=\frac{d y}{y} \rightarrow \frac{d y}{d x}=\frac{y}{x},

which we can easily solve to obtain,

yx=c1\frac{y}{x}=c_{1}

The second ODE in the pair yields:

dudx=eu\frac{d u}{d x}=e^{-u}

which, upon integration, gives:

eux=c2.e^{u}-x=c_{2} .

Now, remember that we are solving these ODEs along characteristic curves. Equation (2.44) gives us uu through c2c_{2} (note that c2c_{2} varies between characteristics but it remains constant along the same characteristic). The particular characteristic curve we are considering is given in terms of the independent variables xx and yy whose relationship is dependent on the constant c1c_{1}. For this reason, c1c_{1} and c2c_{2} are related. At this point we don't exactly know how they are related so we introduce some arbitrary (but differentiable) function for their relationship:

c2=g(c1).c_{2}=g\left(c_{1}\right) .

Hence, using Eqs. (2.42), (2.44) and (2.45), we have:

eu=x+g(y/x),e^{u}=x+g(y / x),

which is an implicit, general solution to Eq. (2.55). In Example 2.1, the solution to the PDE is given in a general form. To determine the functionality of gg, we need to apply the initial condition given by u(x,0)=u(x)u(x, 0)=u(x) where uu is a function of a single variable.

Summary of the method

Suppose we have a semilinear PDE of the general form given by Eq. (2.24). To solve using the method of characteristics:

  • Consider the system:
dxa(x,y)=dyb(x,y)=duf(x,y,u).\frac{d x}{a(x, y)}=\frac{d y}{b(x, y)}=\frac{d u}{f(x, y, u)} .

The equations in (2.47) are equivalent to the PDE.

  • Solve a pair of ODEs to obtain the functions h(x,y)=c1h(x, y)=c_{1} and j(x,y,u)=j(x, y, u)= c2c_{2}.
  • In particular, solve the following ODE to obtain the function h(x,y)=c1h(x, y)=c_{1},
dxdy=a(x,y)b(x,y)( or dydx=b(x,y)a(x,y)).\frac{d x}{d y}=\frac{a(x, y)}{b(x, y)} \quad\left(\text { or } \frac{d y}{d x}=\frac{b(x, y)}{a(x, y)}\right) .
  • And the following for j(x,y,u)=c2j(x, y, u)=c_{2}
dudy=f(x,y,u)b(x,y)( or dudx=f(x,y,u)a(x,y)).\frac{d u}{d y}=\frac{f(x, y, u)}{b(x, y)} \quad\left(\text { or } \frac{d u}{d x}=\frac{f(x, y, u)}{a(x, y)}\right) .
  • Relate the constants c1c_{1} and c2c_{2} through an arbitrary but differentiable function, gg, i.e. j(x,y,u)=g(h(x,y))j(x, y, u)=g(h(x, y)).
  • The function gg can be chosen to satisfy an initial condition of the form u(x,0)=u0(x)u(x, 0)=u_{0}(x) where u0u_{0} is a function of a single variable.

Quasilinear equations

Next, we look at quasilinear PDEs. Note that by Definition 2.2, a quasilinear PDE may also be linear or nonlinear since the definition places no restriction on the function uu itself. The method outlined above can be applied to quasilinear equations as well. Next, we walk through an example as the method can be a little trickier to use in more complicated cases.

Consider the quasilinear PDE of the following form,

a(x,y,u)ux+b(x,y,u)uy=f(x,y,u).a(x, y, u) u_{x}+b(x, y, u) u_{y}=f(x, y, u) .

Following Eq. (2.47), we have:

dxa(x,y,u)=dyb(x,y,u)=duf(x,y,u)\frac{d x}{a(x, y, u)}=\frac{d y}{b(x, y, u)}=\frac{d u}{f(x, y, u)}

From Eqs. (2.51), if we can set up two first-order ODEs [i.e. analogous to the pairs in Eqs. (2.48) or (2.49)], then it is possible to obtain two equations of the form:

h(x,y,u)=c1,j(x,y,u)=c2.h(x, y, u)=c_{1}, \quad j(x, y, u)=c_{2} .

Just like in the case of semilinear PDEs (and as seen in Example 2.1) the constants c1c_{1} and c2c_{2} are related and hence the solution to the PDE (2.50), is given by:

j(x,y,u)=F(h(x,y,u)),j(x, y, u)=F(h(x, y, u)),

where FF is an arbitrary (but differentiable) function. Example 2.2 Solve the following quasilinear PDE,

(y+u)ux+yuy=xy.(y+u) u_{x}+y u_{y}=x-y .

Solution Using Eq. (2.36), we write:

dx/dty+u=dy/dty=du/dtxy\frac{d x / d t}{y+u}=\frac{d y / d t}{y}=\frac{d u / d t}{x-y}

Recall the objective is to form first-order ODEs in the form given by Eqs. (2.48) and (2.49). The algebra in more complicated first-order PDEs tends to be more tedious. Here, we make use of the following:

ab=cda+cb+d=cd.\frac{a}{b}=\frac{c}{d} \Leftrightarrow \frac{a+c}{b+d}=\frac{c}{d} .

Then, we have:

dxdty+u=dydt+dudty+(xy)=ddt(y+u)x.\frac{\frac{d x}{d t}}{y+u}=\frac{\frac{d y}{d t}+\frac{d u}{d t}}{y+(x-y)}=\frac{\frac{d}{d t}(y+u)}{x} .

We now have the following ODE:

xdxdt=(y+u)ddt(y+u).x \frac{d x}{d t}=(y+u) \frac{d}{d t}(y+u) .

Rearranging gives:

xdxdt(y+u)ddt(y+u)=0,x \frac{d x}{d t}-(y+u) \frac{d}{d t}(y+u)=0,

which is equivalent to:

12ddt(x2(y+u)2)x2(y+u)2=c1.\frac{1}{2} \frac{d}{d t}\left(x^{2}-(y+u)^{2}\right) \rightarrow x^{2}-(y+u)^{2}=c_{1} .

Similarly, we have:

dudtxy=dxdtdydt(y+u)y=ddt(xy)u,\frac{\frac{d u}{d t}}{x-y}=\frac{\frac{d x}{d t}-\frac{d y}{d t}}{(y+u)-y}=\frac{\frac{d}{d t}(x-y)}{u},

which is:

ddt(u2(xy)2)=0,u2(xy)2=c2.\frac{d}{d t}\left(u^{2}-(x-y)^{2}\right)=0, \rightarrow u^{2}-(x-y)^{2}=c_{2} .

Finally, we relate the two constants through Eq. (2.53), i.e. c2=F(c1)c_{2}=F\left(c_{1}\right), to obtain the PDE solution as:

u2(xy)2=F(x2(y+u)2),u^{2}-(x-y)^{2}=F\left(x^{2}-(y+u)^{2}\right),

where FF is an arbitrary, differentiable function.

Traffic flow

We now consider an application, namely one-dimensional and one-directional traffic flow. We consider cars on a single-lane road with no entry or exit ramps such that we have conservation of cars. We take xx to denote a unique point on the road at time tt. While cars are discrete objects, we model this in a continuum sense using the notion of traffic density. To ensure that the model is valid, we assume that we are dealing on a large length scale. Then, since car flow is conserved, we can model the traffic flow with the following continuity equation,

ρt+qx=0.\rho_{t}+q_{x}=0 .

The traffic density is denoted by ρ(x,t)\rho(x, t) and it represents the number of cars per unit length at position xx and time tt. The traffic flow or flux is given by q(x,t)q(x, t) at position xx and time tt and it represents the number of cars passing a fixed point in xx per unit time. Before we proceed, we need a constitutive relation between ρ\rho and qq; we will get one by considering the speed at which the cars move. Suppose this is given by cc but this quantity dpes not represent a constant speed. Of course the speed at which cars move on the freeway is a factor of many things but, keeping things simple, let us say that this is predominantly be affected by the traffic density ρ\rho such that c=c(ρ)c=c(\rho). If the traffic density were very low (i.e. very close to zero) then, the cars would be allowed to move at a maximum (hopefully, legal!) speed, say c0c_{0}. The traffic density cannot reach infinity as this would be physically impossible and must therefore be bounded by a maximum value; this may be represented by how many cars can fit on an arbitrary length of the freeway when they are closely packed (i.e. bumper-to-bumper) together. As the traffic density approaches this maximum value, the speed cc approaches zero. Suppose this maximum value is denoted by ρmax\rho_{\max }. It follows that the relation we are looking for (i.e. cc0c \rightarrow c_{0} as ρ0\rho \rightarrow 0 and c0c \rightarrow 0 as ρρmax\rho \rightarrow \rho_{\max }, may be of the following form,

c(ρ)=c0(1ρρmax).c(\rho)=c_{0}\left(1-\frac{\rho}{\rho_{\max }}\right) .

Now, since qq represents the flux, it represents the amount of cars per unit time per unit area. This is therefore given by:

q=c(ρ)ρ.q=c(\rho) \rho .

We take c0=1c_{0}=1 and ρmax=1\rho_{\max }=1 for simplicity and, by differentiating Eq. (2.66) with respect to xx, we have:

qx=((1ρ)ρ)x,=(12ρ)ρx.\begin{aligned} q_{x} & =((1-\rho) \rho)_{x}, \\ & =(1-2 \rho) \rho_{x} . \end{aligned}

Hence, Eq. (2.64) becomes,

ρt+(12ρ)ρx=0.\rho_{t}+(1-2 \rho) \rho_{x}=0 .

This is a quasilinear PDE in ρ(x,t)\rho(x, t) so we may apply the method of characteristics. Remember these are curves in the xtx t-plane on which solutions to Eq. (2.68) are constant. If we now apply Eqs. (2.37) [note: we could of course apply Eq. (2.38) instead], we have the following ODEs:

dxdt=(12ρ),dρdt=0.\frac{d x}{d t}=(1-2 \rho), \frac{d \rho}{d t}=0 .

Recall that ρ=ρ(x,t)\rho=\rho(x, t) and so the first equation in (2.69) does not help us significantly in finding the characteristics; however, the second equation saves the day. The second equation, dρdt=0\frac{d \rho}{d t}=0 tells us that the value of ρ\rho along the characteristics curves is constant, say ρ0\rho_{0} which depends on the value of x0x_{0} at time t=0t=0. Thus, the first equation in Eq. (2.69) becomes,

dxdt=(12ρ0(x0,0))\frac{d x}{d t}=\left(1-2 \rho_{0}\left(x_{0}, 0\right)\right)

with ρ0\rho_{0} constant which integrates to:

x=(12ρ0(x0,0))t+x0.x=\left(1-2 \rho_{0}\left(x_{0}, 0\right)\right) t+x_{0} .

Equation (2.71) implies that the characteristics are straight lines with x0x_{0} being the starting point. Different initial conditions therefore yield different characteristic curves (i.e. different slope). Since we typically plot tt against xx, it is handy to have an equation that relates tt to xx; from Eq. (2.71), we have:

t=xx0(12ρ0(x0,0)).t=\frac{x-x_{0}}{\left(1-2 \rho_{0}\left(x_{0}, 0\right)\right)} .

Figure 2.3 shows various characteristic curves; on the leftmost plot, we show a single curve passing through x0x_{0}. Now, ρ(x0,0)\rho\left(x_{0}, 0\right) represents the initial distribution of the traffic density which varies according to the value of xx at t=0t=0. Let us call this function f(x)f(x) i.e. ρ(x,0)=f(x)\rho(x, 0)=f(x). At two distinct points, say x0x_{0} and x1x_{1}, the value of the density function is f(x0)f\left(x_{0}\right) and f(x1)f\left(x_{1}\right), respectively. Suppose that f(x0)>f(x1)f\left(x_{0}\right)>f\left(x_{1}\right). Graphically, this implies that the slope of the characteristic at x0x_{0} is less steep than that of the characteristic at x11x_{1} 1 (see the middle plot in Fig. 2.3). If the value of ff decreases continuously as we move in the direction of increasing xx, the slopes (i.e. dx/dtd x / d t ) increase producing a somewhat 'fan'-like structure as shown in Fig. 2.3. Since the traffic density is less at x1x_{1} than at x0x_{0}, the traffic is moving faster at x1x_{1}. It follows that there is a gradual separation between the traffic that started at x0x_{0} and at x1x_{1}.

Figure 2.3: Characteristic curves in the xtx t-plane. The curve emanates from the xx-axis satisfying a starting point x0x_{0} (left). Two characteristic curves starting from initial points x0x_{0} and x1x_{1} (middle). Gradual separation between the points x0x_{0} and x1x_{1}, showing a 'fan'-like pattern (right).

A question that arises is the following: what if the traffic density starts low at x0x_{0} and increases as xx increases? Then, the respective characteristic curves will be pointing inward and toward each other. The fate of these curves will ultimately be that they intersect. Now, recall that the model involves a single-lane road and therefore cars cannot pass other cars. If the speed is less at x1x_{1} compared to the speed at x0x_{0}, then cars are approaching the cars in front of them and are forced to reduce their speed to the lower speed the cars in the front maintain. We will see the effect this has on the solutions in the next subsection.

Finally, to conclude this subsection, we note that in order to compute the solution ρ(x,t)\rho(x, t) at any point, we find the characteristics using Eq. (2.72), follow it back to the initial point (x0,0)\left(x_{0}, 0\right) and determine ρ(x,t)=f(x0)\rho(x, t)=f\left(x_{0}\right). For the model we looked at in this section where we have heavier traffic to the left than to the right, a possible initial condition given by ρ(x,0)=f(x)\rho(x, 0)=f(x) may be:

f(x)={0.25 for x<00.25(1x2)2 for x<10 for x1f(x)=\left\{\begin{array}{cc} 0.25 & \text { for } x<0 \\ 0.25\left(1-x^{2}\right)^{2} & \text { for } x<1 \\ 0 & \text { for } x \geq 1 \end{array}\right.

We make the following observation: recall that the maximum density was scaled by ρmax=1\rho_{\max }=1, therefore the initial constant density for x<0x<0 is merely a quarter of the maximum. It is possible that the characteristic curves intersect at some point in time even if light traffic heads into heavier traffic.

1{ }^{1} Note that the slope is defined as the change of xx with tt; however, we are plotting tt against xx and so in Fig. 2.3 the slope appears to be less at x1x_{1} compared to x0x_{0}.

Shocks

As mentioned in the previous subsection, it is possible that more than one characteristic passes through a given point in the xtx t-plane (note: this is a consequence of the nonlinear term given by ρρx\rho \rho_{x} ). This is in fact a very likely scenario leading to the overlap of the characteristics for sufficiently large time. The point at which characteristics first touch is called a shock wave or, simply, a shock. The term comes from gas dynamics where this phenomenon was first encountered.

Consider the traffic flow equation [given by Eq. (2.68)] with the following initial condition,

f(x)={0 for x<10.25[x2(2x)2] for 1x00.25 for x>1f(x)=\left\{\begin{array}{cc} 0 & \text { for } x<-1 \\ 0.25\left[x^{2}(2-x)^{2}\right] & \text { for }-1 \leq x \leq 0 \\ 0.25 & \text { for } x>1 \end{array}\right.

This represents a case where there is fast traffic coming from behind, there's a relatively slow moving traffic at the very front and a transition region where the traffic is slowing down - this is in 0<x<10<x<1. For such a case, at some point in time, the characteristics overlap (see Fig. 2.4). Once the characteristics intersect, the density function is multi-valued at the intersection point since the density function is composed of points where ρ=f(x0)\rho=f\left(x_{0}\right) and ρ=f(x1)\rho=f\left(x_{1}\right). Now,

Figure 2.4: Characteristic curves in the xtx t-plane. The characteristics are pointing inward as faster-moving traffic from behind is catching up to slower-moving traffic in the front.

what happens in terms of the traffic density solution? As Eq. (2.183) indicates the traffic density starts low and gradually approaches the value of 0.25 within 0<x<10<x<1; thereafter, the density remains constant at 0.25 at time t=0t=0. As time progresses, the solution gets steeper and steeper until it becomes vertical which coincides with when the shock wave develops. After the shock develops, the solution in the form we derived in in the previous section becomes invalid: it becomes a multi-valued function which poses a problem as we don't know which value to assign at the solution where the curves intersect. So what can be done? Two things: either we modify the model (which will consequently give a different solution) such that shocks are prevented or, modify the solution procedure. We proceed with the latter.

We give an alternative definition for the 'solution' to the PDE. More specifically, we allow the solution to become vertical at some point in time. This implies that at that point in time, the solution has a jump discontinuity (see notes from Topic B1 for definition). The solution at that point is no longer differentiable and hence cannot be a solution to the PDE in the classical sense. We can find ways around this though. We can think of a solution with a jump discontinuity as a shock, travelling in the direction of increasing xx as time increases (see Fig. 2.5). This 'solution' should be valid if we can show that the speed of the shock does not violate the principle of conservation (recall that in deriving the model we assume that the number of cars entering and leaving a certain length of the freeway, is constant). We can therefore divide the domain in two regions: upstream and downstream of the shock. A solution to a PDE that does not need to be continuous is referred to as a weak solution. A strong or classical solution is one that it is defined by a continuous function.

Figure 2.5: Shock formation. At t=t1t=t_{1} a shock forms with a jump discontiniuity at xs(t1)x_{s}\left(t_{1}\right). At a later time, t2t_{2}, the jump discontinuity is at xs(t2)x_{s}\left(t_{2}\right).

Suppose that a shock develops and moves with position xs(t)x_{s}(t) and now, the solution ρ(x,t)\rho(x, t) has a jump discontinuity at xsx_{s} resulting in the values of the traffic density at xsϵx_{s}-\epsilon and at xs+ϵx_{s}+\epsilon, (where ϵ>0\epsilon>0 and small) to be finite but not equal. We introduce the following notation: xs=xsϵx_{s_{-}}=x_{s}-\epsilon and xs+=xs+ϵx_{s_{+}}=x_{s}+\epsilon. In particular, we have:

ρ(x,t)={ρ(xs,t),x<xsρ(xs+,t),x>xs\rho(x, t)= \begin{cases}\rho\left(x_{s_{-}}, t\right), & x<x_{s} \\ \rho\left(x_{s_{+}}, t\right), & x>x_{s}\end{cases}

Away from the shock, the solution is described by a smooth function, ρ(x,t)\rho(x, t). To determine the position and hence the speed of the shock wave, we demand the following:

  1. The original PDE describing the model, i.e. Eq. (2.68) needs to be satisfied on either side of the shock: at x<xsx<x_{s} and x>xsx>x_{s} for all time, tt. So, using the characteristics, we construct information on both sides of the shock.
  2. As previously mentioned, the total flow, q=(1ρ)ρq=(1-\rho) \rho needs to be conserved, relative to the moving shock. The flow from the left into the shock must be equal to the flow from right away from the shock. The following theorem ensures that this is satisfied if ρ(x,t)\rho(x, t) is to be a weak solution to the PDE.

Theorem: Rankine-Hugoniot condition

If ρ(x,t)\rho(x, t) is a weak solution to the quasilinear traffic flow PDE (2.68), such that ρ\rho is discontinuous across the curve x=xs(t)x=x_{s}(t) but it is smooth on either side of xsx_{s}, then ρ\rho must satisfy the following condition:

xs=q(xs,t)q(xs+,t)ρ(xs,t)ρ(xs+,t),x_{s}^{\prime}=\frac{q\left(x_{s_{-}}, t\right)-q\left(x_{s_{+}}, t\right)}{\rho\left(x_{s_{-}}, t\right)-\rho\left(x_{s_{+}}, t\right)},

where xsx_{s}^{\prime} is the speed of the shock wave and qq evaluated at xsx_{s_{-}}and xs+x_{s_{+}} denotes the flux into and out of the moving shock, respectively. Equation (2.76) is known as the Rankine-Hugoniot formula for the shock speed.

For the traffic flow model described in this set of notes, the flux is given by q=(1ρ)ρq=(1-\rho) \rho. Now, let ρ=ρ(xs,t)\rho_{-}=\rho\left(x_{s_{-}}, t\right) and ρ+=ρ(xs+,t)\rho_{+}=\rho\left(x_{s_{+}}, t\right). Then, Eq. (2.76) implies that the shocks move at speeds which obey:

xs=dxsdt=(1ρ)ρq(xs,t)(1ρ+)ρ+q(xs+,t)ρρ+=1ρ+ρ;\begin{aligned} x_{s}^{\prime}=\frac{d x_{s}}{d t} & =\frac{\overbrace{\left(1-\rho_{-}\right) \rho_{-}}^{q\left(x_{s_{-}}, t\right)}-\overbrace{\left(1-\rho_{+}\right) \rho_{+}}^{q\left(x_{s_{+}}, t\right)}}{\rho_{-}-\rho_{+}} \\ & =1-\rho_{+}-\rho_{-} ; \end{aligned}

which implies that the shock speed for the traffic equation is determined by the density value on either side of the shock.