First-order PDEs
In this section, we describe a general technique for solving first-order equations. We start our discussion with solutions to the transport equation to demonstrate the geometric concept behind PDE solution methods. We then move on to semilinear, first-order PDEs to introduce the idea behind the method of characteristics. These ideas can then be extended to the more complicated PDEs given by quasilinear and fully nonlinear cases. Before we continue, however, we expand on the classification of the PDEs discussed in Sec. 2.1, giving a broader definition of the various classes of PDEs we can encounter.
- A PDE of order $m$ is quasilinear if it is linear in the derivatives of order $m$, with coefficients that depend on the independent variables and on derivatives of the unknown function of order strictly less than $m$.
- A quasilinear PDE where the coefficients of the derivatives of order $m$ are functions of the independent variables alone is called semilinear.
- A PDE which is linear in the unknown function and all its derivatives with coefficients depending on the independent variables alone is called linear.
- A PDE which is not quasilinear is fully nonlinear.
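To make the four classes concrete, the block below lists one first-order example of each for an unknown $u(x,y)$. These equations are standard illustrations chosen here for convenience; they are not taken from Sec. 2.1.

```latex
\begin{align*}
  \text{linear:}           \quad & u_x + y\,u_y = x\,u, \\
  \text{semilinear:}       \quad & u_x + y\,u_y = u^2, \\
  \text{quasilinear:}      \quad & u_x + u\,u_y = 0, \\
  \text{fully nonlinear:}  \quad & u_x^2 + u_y^2 = 1.
\end{align*}
```

Each example is chosen so that it belongs to the named class but not to the more restrictive class above it: the semilinear example is nonlinear in $u$ only through the right-hand side, the quasilinear one has a coefficient that depends on $u$, and the last one is nonlinear in the first derivatives themselves.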
The transport equation
The simplest transport equation takes the form,

$$u_t + c\,u_x = 0,$$

where $c$ is a nonzero constant and $u = u(x,t)$. We seek functions $u(x,t)$ that satisfy the transport equation. We approach this through the directional derivative. The directional derivative of $u$ in the direction of a vector $\mathbf{v}$ is given by:

$$D_{\mathbf{v}} u = \nabla u \cdot \hat{\mathbf{v}}, \tag{2.19}$$

where $\hat{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$ is the unit vector in the direction of $\mathbf{v}$. In particular, if the directional derivative in Eq. (2.19) is 0 at all points then $u$ is constant along all lines that are tangential to $\mathbf{v}$. This can be thought of as a generalisation of a partial derivative. Now, by rewriting the transport equation as a dot product, we have

$$0 = u_t + c\,u_x = (c, 1) \cdot (u_x, u_t) = (c, 1) \cdot \nabla u. \tag{2.20}$$

Notice that the RHS of Eq. (2.20) is almost the directional derivative of $u$ in the direction of $\mathbf{v} = (c, 1)$. To make it look exactly like the directional derivative, we can simply divide by $|\mathbf{v}| = \sqrt{c^2 + 1}$, i.e.:

$$0 = \frac{(c, 1)}{\sqrt{c^2 + 1}} \cdot \nabla u,$$

which is equivalent to Eq. (2.19) set to zero. Hence, every solution $u$ must be constant along lines that are tangential to $\mathbf{v}$. In this case, the lines are also parallel to the vector $\mathbf{v}$ (see Fig. 2.1). Since these lines are parallel to $\mathbf{v} = (c, 1)$, they have slope,

$$\frac{dt}{dx} = \frac{1}{c}.$$

Upon integrating, we obtain:

$$t = \frac{x}{c} + k,$$

where $k$ is a constant. Rearranging to give

$$x - ct = -ck,$$

we can see that the equation describes infinitely many lines which are tangential and parallel to the vector $\mathbf{v}$. Along each of these lines $u$ is constant and, so, $u$ depends only on $x - ct$. The general solution to the PDE therefore takes the form,

$$u(x,t) = f(x - ct),$$

where $f$ is some arbitrary, differentiable function.
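The claim that any differentiable profile of the travelling-wave form solves the transport equation can be checked numerically. The sketch below assumes the equation is written as $u_t + c\,u_x = 0$ with solution $u(x,t) = f(x - ct)$; the Gaussian profile, the speed and the sample points are hypothetical choices made only for illustration.

```python
import math

def transport_residual(f, c, x, t, h=1e-5):
    """Central-difference estimate of u_t + c*u_x for u(x, t) = f(x - c*t)."""
    u = lambda x_, t_: f(x_ - c * t_)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + c * u_x

# Hypothetical profile and speed, chosen only for illustration.
profile = lambda xi: math.exp(-xi ** 2)
speed = 2.0

residuals = [transport_residual(profile, speed, x, t)
             for x in (-1.0, 0.3, 2.0) for t in (0.0, 0.5, 1.5)]
max_residual = max(abs(r) for r in residuals)
print(max_residual)  # small, limited only by finite-difference error
```

Swapping `profile` for any other smooth function should leave the residual at the level of the discretisation error, which is the numerical counterpart of the function in the general solution being arbitrary.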
Method of characteristics
Consider the PDE:

$$a(x,y)\,u_x + b(x,y)\,u_y = c(x,y,u), \tag{2.24}$$

where $u = u(x,y)$. Equation (2.24) is semilinear and it includes the linear form,
Figure 2.1: Characteristic curves; the PDE solution is constant along each curve in space and each curve is equal to a different constant value, . The initial data is propagated along these characteristic curves.
with . We also consider the initial condition given by:
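Suppose we have found a solution $u(x,y)$ to Eq. (2.24), written here as $a\,u_x + b\,u_y = c$. Plotting the solution against the independent variables $x$ and $y$ gives a surface, $z = u(x,y)$. Let us denote this surface by $S$, or equivalently,

$$F(x,y,z) = u(x,y) - z = 0 \tag{2.27}$$

for all real variables $x$ and $y$. A surface $S$ that represents a solution to the PDE is known as an integral surface. In Eq. (2.27), $F$ is a function of $x$, $y$ and $z$, where $z$ is a variable. For instance, if $u(x,y) = x + y$ then $F(x,y,z) = x + y - z$. From vector calculus, we know that the gradient, $\nabla F$, gives a normal to the surface $S$, i.e.:

$$\nabla F = (u_x, u_y, -1). \tag{2.28}$$

This is a downward-pointing vector (because of the $-1$ term in the $z$-direction), as shown in Fig. 2.2. Note that the PDE (2.24) may be rewritten in the following dot product form:

$$(a, b, c) \cdot (u_x, u_y, -1) = 0. \tag{2.29}$$

Equation (2.29) is the scalar product of two vectors; a zero dot product indicates that the two vectors are at right angles. We can write Eq. (2.29) as:

$$\mathbf{v} \cdot \nabla F = 0. \tag{2.30}$$

The vector,

$$\mathbf{v} = \big(a(x,y),\, b(x,y),\, c(x,y,u)\big), \tag{2.31}$$

is normal to $\nabla F$. Now, since $\nabla F$ is normal to the surface $S$, this means that $\mathbf{v}$ is tangential to the surface which, in turn, implies that $\mathbf{v}$ lies in the tangent plane to $S$ (see Fig. 2.2).

Figure 2.2: The surface $S$ showing the normal to the surface, $\nabla F$, at a point and the vector $\mathbf{v}$ [given by Eq. (2.31)] which lies in the tangent plane to $S$.

As a result, the PDE dictates that any integral surface through a given point must be tangent to $\mathbf{v}$. This means that we start from an initial condition (which lies in the integral surface) and we move in the direction of $\mathbf{v}$; since $\mathbf{v}$ lies in the tangent plane, we move along a curve which lies within $S$. This curve is referred to as the characteristic curve. The question is: how do we go about constructing such a surface? We start by describing points on a characteristic curve through the parametrisation,

$$(x, y, z) = \big(x(s),\, y(s),\, z(s)\big). \tag{2.32}$$

Differentiating with respect to $s$ gives the tangent vector

$$\big(x'(s),\, y'(s),\, z'(s)\big), \tag{2.33}$$

where the primes denote differentiation with respect to $s$. This tangent vector, then, belongs to the tangent plane to the surface at a given point. Recall that the characteristic curve moves in the direction of $\mathbf{v}$ and as such, the tangent vector and $\mathbf{v}$ must be proportional. Therefore, we have

$$(x', y', z') = \mu\,(a, b, c) \tag{2.34}$$

for some scalar $\mu$. Using Eqs. (2.31), (2.33) and (2.34), we can write down the following:

$$\frac{dx}{ds} = \mu a, \qquad \frac{dy}{ds} = \mu b, \qquad \frac{dz}{ds} = \mu c. \tag{2.35}$$

Or, in differential form (using $z = u$ on the integral surface):

$$\frac{dx}{a} = \frac{dy}{b} = \frac{du}{c}. \tag{2.36}$$

From Eq. (2.36), we can form two pairs of ODEs:

$$\frac{dy}{dx} = \frac{b}{a}, \qquad \frac{du}{dx} = \frac{c}{a}, \tag{2.37}$$

or

$$\frac{dx}{dy} = \frac{a}{b}, \qquad \frac{du}{dy} = \frac{c}{b}. \tag{2.38}$$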
The solutions to Eq. (2.37) [or Eq. (2.38)] determine the characteristics of the PDE.

Example 2.1 Find a solution to the following semilinear PDE,
Solution We start off by writing the ODEs as in Eq. (2.36):
The objective here is to find a pair of differential equations which are easy to solve with basic ODE techniques. As mentioned previously, we have the two options given by Eqs. (2.37) and (2.38). We see that if we choose the first pair [Eq. (2.37)], we have:
which we can easily solve to obtain,
The second ODE in the pair yields:
which, upon integration, gives:
Now, remember that we are solving these ODEs along characteristic curves. Equation (2.44) gives us through (note that varies between characteristics but it remains constant along the same characteristic). The particular characteristic curve we are considering is given in terms of the independent variables and whose relationship is dependent on the constant . For this reason, and are related. At this point we don't exactly know how they are related so we introduce some arbitrary (but differentiable) function for their relationship:
Hence, using Eqs. (2.42), (2.44) and (2.45), we have:
which is an implicit, general solution to Eq. (2.39). In Example 2.1, the solution to the PDE is given in a general form. To determine the form of the arbitrary function, we need to apply an initial condition such as Eq. (2.26), which prescribes the solution in terms of a function of a single variable.
Summary of the method
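Suppose we have a semilinear PDE of the general form given by Eq. (2.24), i.e. $a\,u_x + b\,u_y = c$. To solve using the method of characteristics:
- Consider the system:

$$\frac{dx}{a} = \frac{dy}{b} = \frac{du}{c}. \tag{2.47}$$

The equations in (2.47) are equivalent to the PDE.
- Solve a pair of ODEs to obtain the functions $y(x)$ and $u(x)$ along a characteristic.
- In particular, solve the following ODE to obtain the function $y(x)$,

$$\frac{dy}{dx} = \frac{b}{a}. \tag{2.48}$$

- And the following for $u(x)$:

$$\frac{du}{dx} = \frac{c}{a}. \tag{2.49}$$

- Relate the constants of integration $C_1$ and $C_2$ through an arbitrary but differentiable function, $G$, i.e. $C_2 = G(C_1)$.
- The function $G$ can be chosen to satisfy an initial condition of the form given by Eq. (2.26), in which the prescribed data is a function of a single variable.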
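The steps above can also be followed numerically, by integrating the characteristic ODEs from a point on the initial curve. The sketch below is a numerical stand-in for the analytic integration described in the summary: it uses a classical Runge-Kutta step on the parametrised system $dx/ds = a$, $dy/ds = b$, $du/ds = c$. The PDE $u_x + u_y = u$ with data $u(x,0) = \sin x$ is a hypothetical test problem (not Example 2.1), chosen because its solution $u = \sin(x-y)\,e^{y}$ is known in closed form.

```python
import math

def rk4_step(rhs, state, ds):
    """One classical Runge-Kutta step for d(state)/ds = rhs(state)."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * ds * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * ds * k for s, k in zip(state, k2)])
    k4 = rhs([s + ds * k for s, k in zip(state, k3)])
    return [s + ds * (p + 2 * q + 2 * r + w) / 6
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]

def characteristic_rhs(state):
    """dx/ds = a, dy/ds = b, du/ds = c for the illustrative PDE u_x + u_y = u."""
    x, y, u = state
    return [1.0, 1.0, u]

def trace_characteristic(x0, s_end, steps=200):
    """Follow the characteristic that starts on y = 0 at (x0, 0, sin(x0))."""
    state = [x0, 0.0, math.sin(x0)]
    ds = s_end / steps
    for _ in range(steps):
        state = rk4_step(characteristic_rhs, state, ds)
    return state

x, y, u_numeric = trace_characteristic(x0=0.7, s_end=1.0)
u_exact = math.sin(x - y) * math.exp(y)  # closed form for this test problem
print(x, y, u_numeric, u_exact)
```

Only `characteristic_rhs` and the starting state encode the particular PDE and initial data, so other semilinear problems of the form (2.24) could be tried by editing those two places.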
Quasilinear equations
Next, we look at quasilinear PDEs. Note that by Definition 2.2, a quasilinear PDE may also be linear or nonlinear since the definition places no restriction on the function itself. The method outlined above can be applied to quasilinear equations as well. We first set out the general form and then walk through an example, as the method can be a little trickier to use in more complicated cases.
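Consider the quasilinear PDE of the following form,

$$a(x,y,u)\,u_x + b(x,y,u)\,u_y = c(x,y,u). \tag{2.50}$$

Following Eq. (2.47), we have:

$$\frac{dx}{a(x,y,u)} = \frac{dy}{b(x,y,u)} = \frac{du}{c(x,y,u)}. \tag{2.51}$$

From Eqs. (2.51), if we can set up two first-order ODEs [i.e. analogous to the pairs in Eqs. (2.48) or (2.49)], then it is possible to obtain two equations of the form:

$$\phi(x,y,u) = C_1, \qquad \psi(x,y,u) = C_2. \tag{2.52}$$

Just like in the case of semilinear PDEs (and as seen in Example 2.1), the constants $C_1$ and $C_2$ are related and hence the solution to the PDE (2.50) is given by:

$$C_2 = G(C_1), \quad \text{i.e.} \quad \psi(x,y,u) = G\big(\phi(x,y,u)\big), \tag{2.53}$$

where $G$ is an arbitrary (but differentiable) function.

Example 2.2 Solve the following quasilinear PDE,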
Solution Using Eq. (2.36), we write:
Recall the objective is to form first-order ODEs in the form given by Eqs. (2.48) and (2.49). The algebra in more complicated first-order PDEs tends to be more tedious. Here, we make use of the following:
Then, we have:
We now have the following ODE:
Rearranging gives:
which is equivalent to:
Similarly, we have:
which is:
Finally, we relate the two constants through Eq. (2.53), i.e. , to obtain the PDE solution as:
where is an arbitrary, differentiable function.
Traffic flow
We now consider an application, namely one-dimensional and one-directional traffic flow. We consider cars on a single-lane road with no entry or exit ramps, so that the number of cars is conserved. We take $x$ to denote a unique point on the road at time $t$. While cars are discrete objects, we model them in a continuum sense using the notion of traffic density. For the model to be valid, we assume that we are working on a large length scale. Then, since the number of cars is conserved, we can model the traffic flow with the following continuity equation,

$$\rho_t + q_x = 0. \tag{2.64}$$

The traffic density is denoted by $\rho(x,t)$ and it represents the number of cars per unit length at position $x$ and time $t$. The traffic flow or flux is given by $q(x,t)$ at position $x$ and time $t$ and it represents the number of cars passing a fixed point in $x$ per unit time. Before we proceed, we need a constitutive relation between $q$ and $\rho$; we will get one by considering the speed at which the cars move. Suppose this speed is denoted by $v$; it is not a constant. Of course the speed at which cars move on the freeway depends on many things but, keeping things simple, let us say that it is predominantly affected by the traffic density, such that $v = v(\rho)$.

If the traffic density were very low (i.e. very close to zero) then the cars would be allowed to move at a maximum (hopefully, legal!) speed, say $v_{\max}$. The traffic density cannot reach infinity, as this would be physically impossible, and must therefore be bounded by a maximum value; this may be represented by how many cars can fit on an arbitrary length of the freeway when they are closely packed (i.e. bumper-to-bumper) together. As the traffic density approaches this maximum value, the speed approaches zero. Suppose this maximum value is denoted by $\rho_{\max}$. It follows that the relation we are looking for (i.e. $v \to v_{\max}$ as $\rho \to 0$ and $v \to 0$ as $\rho \to \rho_{\max}$) may be of the following form,

$$v(\rho) = v_{\max}\left(1 - \frac{\rho}{\rho_{\max}}\right). \tag{2.65}$$

Now, since $q$ represents the flux, i.e. the number of cars passing a fixed point per unit time, it is the density multiplied by the speed. This is therefore given by:

$$q = \rho\,v(\rho) = v_{\max}\,\rho\left(1 - \frac{\rho}{\rho_{\max}}\right). \tag{2.66}$$

We take $v_{\max} = 1$ and $\rho_{\max} = 1$ for simplicity and, by differentiating Eq. (2.66) with respect to $x$, we have:

$$q_x = (1 - 2\rho)\,\rho_x. \tag{2.67}$$

Hence, Eq. (2.64) becomes,

$$\rho_t + (1 - 2\rho)\,\rho_x = 0. \tag{2.68}$$

This is a quasilinear PDE in $\rho$ so we may apply the method of characteristics. Remember these are curves in the $(x,t)$-plane on which solutions to Eq. (2.68) are constant. If we now apply Eqs. (2.37) [note: we could of course apply Eq. (2.38) instead], we have the following ODEs:

$$\frac{dx}{dt} = 1 - 2\rho, \qquad \frac{d\rho}{dt} = 0. \tag{2.69}$$

Recall that $\rho = \rho(x,t)$ and so the first equation in (2.69) does not help us significantly in finding the characteristics; however, the second equation saves the day. The second equation tells us that the value of $\rho$ along the characteristic curves is constant, say $\rho_0$, which depends on the value of $x$ at time $t = 0$. Thus, the first equation in Eq. (2.69) becomes,

$$\frac{dx}{dt} = 1 - 2\rho_0, \tag{2.70}$$

with $\rho_0$ constant, which integrates to:

$$x = (1 - 2\rho_0)\,t + x_0. \tag{2.71}$$

Equation (2.71) implies that the characteristics are straight lines with $x_0$ being the starting point. Different initial conditions therefore yield different characteristic curves (i.e. different slope). Since we typically plot $t$ against $x$, it is handy to have an equation that relates $t$ to $x$; from Eq. (2.71), we have:

$$t = \frac{x - x_0}{1 - 2\rho_0}. \tag{2.72}$$
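The ingredients of the traffic model can be collected in a few short functions. The sketch below assumes the linear speed-density law with both maxima scaled to 1, so that the flux is $\rho(1-\rho)$ and the characteristic speed is $1 - 2\rho$; the function names and the sample densities are illustrative choices, not notation from the notes.

```python
def speed(rho, v_max=1.0, rho_max=1.0):
    """Assumed linear speed-density law: v_max at rho = 0, zero at rho = rho_max."""
    return v_max * (1.0 - rho / rho_max)

def flux(rho, v_max=1.0, rho_max=1.0):
    """Flux = density * speed."""
    return rho * speed(rho, v_max, rho_max)

def characteristic_speed(rho, h=1e-6):
    """dq/drho by central differences; equals 1 - 2*rho for the scaled model."""
    return (flux(rho + h) - flux(rho - h)) / (2 * h)

def characteristic_position(x0, rho0, t):
    """Straight-line characteristic x = x0 + (1 - 2*rho0) * t."""
    return x0 + (1.0 - 2.0 * rho0) * t

for rho in (0.0, 0.25, 0.5, 1.0):
    print(rho, speed(rho), flux(rho), characteristic_speed(rho))
```

The printed characteristic speed changes sign at half the maximum density: characteristics carrying lighter traffic move forwards, those carrying heavier traffic move backwards, even though the cars themselves never reverse.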
Figure 2.3 shows various characteristic curves; on the leftmost plot, we show a single curve passing through the starting point $x_0$. Now, the constant $\rho_0$ carried by each characteristic is set by the initial distribution of the traffic density, which varies according to the value of $x$ at $t = 0$. Let us call this function $f$, i.e. $\rho(x,0) = f(x)$. At two distinct points, say $x_1$ and $x_2$, the value of the density function is $f(x_1)$ and $f(x_2)$, respectively. Suppose that $f(x_1) > f(x_2)$. Graphically, this implies that the slope of the characteristic at $x_1$ is less steep than that of the characteristic at $x_2$ (see the middle plot in Fig. 2.3). If the value of $\rho_0$ decreases continuously as we move in the direction of increasing $x$, the slopes (i.e. $dx/dt = 1 - 2\rho_0$) increase, producing a somewhat 'fan'-like structure as shown in Fig. 2.3. Since the traffic density is less at $x_2$ than at $x_1$, the traffic is moving faster at $x_2$. It follows that there is a gradual separation between the traffic that started at $x_1$ and at $x_2$.

Figure 2.3: Characteristic curves in the $(x,t)$-plane. The curve emanates from the $x$-axis at a starting point $x_0$ (left). Two characteristic curves starting from initial points $x_1$ and $x_2$ (middle). Gradual separation between the points $x_1$ and $x_2$, showing a 'fan'-like pattern (right).

A question that arises is the following: what if the traffic density starts low at $x_1$ and increases as $x$ increases? Then, the respective characteristic curves will be pointing inward and toward each other. The fate of these curves will ultimately be that they intersect. Now, recall that the model involves a single-lane road and therefore cars cannot pass other cars. If the speed is less at $x_2$ compared to the speed at $x_1$, then cars are approaching the cars in front of them and are forced to reduce their speed to the lower speed that the cars in front maintain. We will see the effect this has on the solutions in the next subsection.

Finally, to conclude this subsection, we note that in order to compute the solution at any point $(x,t)$, we find the characteristic through that point using Eq. (2.72), follow it back to the initial point $x_0$ and determine $\rho_0 = f(x_0)$. For the model we looked at in this section, where we have heavier traffic to the left than to the right, a possible initial condition given by $\rho(x,0) = f(x)$ may be:

We make the following observation: recall that the maximum density was scaled to $\rho_{\max} = 1$; therefore the constant part of the initial density, 0.25, is merely a quarter of the maximum. It is possible that the characteristic curves intersect at some point in time even if light traffic heads into heavier traffic.

Note that the slope is defined as the change of $x$ with $t$; however, we are plotting $t$ against $x$ and so in Fig. 2.3 the slope appears to be less at $x_2$ compared to $x_1$.
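The recipe of following a characteristic back to its starting point can be sketched in code. The example below assumes the scaled model, in which the characteristic through a point on the road at a given time starts at the root $x_0$ of $x_0 + (1 - 2f(x_0))\,t = x$, and finds that root by bisection. The piecewise-linear `initial_density` is a hypothetical profile with heavier traffic on the left; it is not the initial condition of the notes.

```python
def initial_density(x):
    """Hypothetical initial profile: heavier traffic on the left, lighter on the right."""
    if x < 0.0:
        return 0.5
    if x > 1.0:
        return 0.25
    return 0.5 - 0.25 * x

def foot_of_characteristic(x, t, f=initial_density, iterations=80):
    """Solve x0 + (1 - 2*f(x0)) * t = x for x0 by bisection.

    Assumes f is non-increasing with values in [0, 1], so the left-hand side
    is increasing in x0 and the root lies in [x - t, x + t].
    """
    lo, hi = x - t, x + t
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + (1.0 - 2.0 * f(mid)) * t < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def density(x, t):
    """Density at (x, t): the initial value carried along the characteristic."""
    return initial_density(foot_of_characteristic(x, t))

print(density(-3.0, 1.0), density(0.75, 1.0), density(4.0, 1.0))
```

Bisection is used here only because it is simple and robust; the monotonicity assumption stated in the docstring is what guarantees a single root, and it fails precisely in the case where characteristics intersect.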
Shocks
As mentioned in the previous subsection, it is possible that more than one characteristic passes through a given point in the $(x,t)$-plane (note: this is a consequence of the nonlinear term given by $\rho\,\rho_x$). This is in fact a very likely scenario, leading to the overlap of the characteristics for sufficiently large time. The point at which characteristics first touch is called a shock wave or, simply, a shock. The term comes from gas dynamics where this phenomenon was first encountered.
Consider the traffic flow equation [given by Eq. (2.68)] with the following initial condition,
This represents a case where fast traffic is coming from behind, relatively slow-moving traffic sits at the very front, and a transition region lies in between where the traffic is slowing down. For such a case, at some point in time, the characteristics overlap (see Fig. 2.4). Once the characteristics intersect, the density function is multi-valued at the intersection point, since characteristics carrying different density values meet there.

Figure 2.4: Characteristic curves in the $(x,t)$-plane. The characteristics are pointing inward as faster-moving traffic from behind is catching up to slower-moving traffic in the front.

Now, what happens in terms of the traffic density solution? As Eq. (2.74) indicates, at time $t = 0$ the traffic density starts low and gradually approaches the value of 0.25 across the transition region; thereafter, the density remains constant at 0.25. As time progresses, the solution gets steeper and steeper until it becomes vertical, which coincides with the moment the shock wave develops. After the shock develops, the solution in the form we derived in the previous section becomes invalid: it becomes a multi-valued function, which poses a problem as we do not know which value to assign to the solution where the curves intersect. So what can be done? Two things: either we modify the model such that shocks are prevented (which will consequently give a different solution), or we modify the solution procedure. We proceed with the latter.
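When two characteristics of the scaled traffic model meet can be worked out directly from their straight-line form. The sketch below assumes each characteristic is $x = x_0 + (1 - 2\rho_0)\,t$; the starting points and densities in the usage lines are hypothetical values, not those of the initial condition in the notes.

```python
def crossing_time(x1, rho1, x2, rho2):
    """Time at which the characteristics from x1 and x2 meet, or None if they never do.

    Each characteristic is x = x0 + (1 - 2*rho0) * t, so equating the two gives
    t = (x2 - x1) / (2 * (rho2 - rho1)).
    """
    if rho1 == rho2:
        return None                 # parallel lines
    t = (x2 - x1) / (2.0 * (rho2 - rho1))
    return t if t > 0.0 else None   # only crossings in the future count

# Hypothetical data: light traffic behind (x = 0), heavier traffic ahead (x = 1).
t_cross = crossing_time(0.0, 0.05, 1.0, 0.25)
x_cross = 0.0 + (1.0 - 2.0 * 0.05) * t_cross
print(t_cross, x_cross)
```

With the densities swapped so that the heavier traffic is behind, `crossing_time` returns `None`, which corresponds to the fan of separating characteristics rather than a shock.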
We give an alternative definition for the 'solution' to the PDE. More specifically, we allow the solution to become vertical at some point in time. This implies that at that point in time, the solution has a jump discontinuity (see notes from Topic B1 for definition). The solution at that point is no longer differentiable and hence cannot be a solution to the PDE in the classical sense. We can find ways around this though. We can think of a solution with a jump discontinuity as a shock, travelling in the direction of increasing $x$ as time increases (see Fig. 2.5). This 'solution' should be valid if we can show that the speed of the shock does not violate the principle of conservation (recall that in deriving the model we assumed that cars are neither created nor destroyed along any stretch of the freeway). We can therefore divide the domain into two regions: upstream and downstream of the shock. A solution to a PDE that does not need to be continuous is referred to as a weak solution. A strong or classical solution is one that is defined by a continuously differentiable function.
Figure 2.5: Shock formation. At $t = t_1$ a shock forms with a jump discontinuity at $x = s(t_1)$. At a later time, $t_2 > t_1$, the jump discontinuity is at $x = s(t_2)$.
Suppose that a shock develops and moves with position $x = s(t)$. The solution now has a jump discontinuity at $x = s(t)$, so that the values of the traffic density at $s(t) - \epsilon$ and at $s(t) + \epsilon$ (where $\epsilon > 0$ is small) are finite but not equal. We introduce the following notation: $s^- = s(t) - \epsilon$ and $s^+ = s(t) + \epsilon$. In particular, we have:

$$\rho(s^-, t) \neq \rho(s^+, t).$$

Away from the shock, the solution is described by a smooth function, $\rho(x,t)$. To determine the position and hence the speed of the shock wave, we demand the following:
- The original PDE describing the model, i.e. Eq. (2.68), needs to be satisfied on either side of the shock: at $x < s(t)$ and at $x > s(t)$ for all time, $t$. So, using the characteristics, we construct information on both sides of the shock.
- As previously mentioned, the total flow, $q$, needs to be conserved relative to the moving shock. The flow from the left into the shock must be equal to the flow on the right away from the shock. The following theorem ensures that this is satisfied if $\rho$ is to be a weak solution to the PDE.

Theorem: Rankine-Hugoniot condition

If $\rho(x,t)$ is a weak solution to the quasilinear traffic flow PDE (2.68), such that $\rho$ is discontinuous across the curve $x = s(t)$ but is smooth on either side of $x = s(t)$, then $\rho$ must satisfy the following condition:

$$s'(t) = \frac{q\big(\rho(s^+, t)\big) - q\big(\rho(s^-, t)\big)}{\rho(s^+, t) - \rho(s^-, t)}, \tag{2.76}$$

where $s'(t)$ is the speed of the shock wave and $q$ evaluated at $s^-$ and $s^+$ denotes the flux into and out of the moving shock, respectively. Equation (2.76) is known as the Rankine-Hugoniot formula for the shock speed.

For the traffic flow model described in this set of notes, the flux is given by $q(\rho) = \rho(1 - \rho)$. Now, let $\rho^- = \rho(s^-, t)$ and $\rho^+ = \rho(s^+, t)$ denote the limiting densities on either side of the shock as $\epsilon \to 0$. Then, Eq. (2.76) implies that the shocks move at speeds which obey:

$$s'(t) = \frac{\rho^+(1 - \rho^+) - \rho^-(1 - \rho^-)}{\rho^+ - \rho^-} = 1 - (\rho^+ + \rho^-),$$

which implies that the shock speed for the traffic equation is determined by the density value on either side of the shock.
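The shock-speed formula is easy to evaluate once the densities on either side of the shock are known. The sketch below assumes the scaled flux $q(\rho) = \rho(1-\rho)$ and computes the Rankine-Hugoniot speed as the jump in flux divided by the jump in density; the names `rho_minus` and `rho_plus` and the sample states are illustrative.

```python
def flux(rho):
    """Scaled traffic flux q(rho) = rho * (1 - rho)."""
    return rho * (1.0 - rho)

def shock_speed(rho_minus, rho_plus):
    """Rankine-Hugoniot speed: jump in flux divided by jump in density."""
    if rho_minus == rho_plus:
        raise ValueError("no jump: the densities on both sides are equal")
    return (flux(rho_plus) - flux(rho_minus)) / (rho_plus - rho_minus)

# Hypothetical states: light traffic behind the shock, heavier traffic ahead.
s_dot = shock_speed(0.05, 0.25)
print(s_dot)
```

For this flux the quotient simplifies to one minus the sum of the two densities, so a shock between densities that add up to 1 is stationary, and one between heavier states travels backwards along the road.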