WikiJournal Preprints/Cut the coordinates! (or Vector Analysis Done Fast)

WikiJournal Preprints
Open access • Publication charge free • Public peer review

WikiJournal User Group is a publishing group of open-access, free-to-publish, Wikipedia-integrated academic journals.


Article information

Abstract

The gradient, the curl, the divergence, and the Laplacian are initially defined, without coordinates, as closed-surface integrals per unit volume—the definition of the Laplacian being indifferent to whether the operand is a scalar field or a vector field. Four integral theorems—including the divergence theorem—follow almost immediately, provided that the initial definitions are unambiguous. Their unambiguity, together with their usefulness, is established as follows, at a level suitable for beginners (although this abstract is for prospective instructors):
  • The gradient is related to an acceleration through an equation of motion;
  • The divergence is related to two time-derivatives of density (the partial derivative and the material derivative) through two forms of an equation of continuity;
  • The component of the curl in a general direction is expressed as a divergence (now known to be unambiguous);
  • The same is done for the general component of the gradient, yielding not only a second proof of unambiguity of the gradient, but also the relation between the gradient and the directional derivative; this together with the original definition of the Laplacian shows that the Laplacian of a scalar field is the divergence of the gradient and therefore unambiguous. The unambiguity of the Laplacian of a vector field then follows from a component argument (as for the curl) or from a linearity argument.

The derivation of the relation between the gradient and the directional derivative yields a coordinate-free definition of the dot-del operator for a scalar right-hand operand. But, as the directional derivative is also defined for a non-scalar operand, the same relation offers a method of generalizing the dot-del operator, so that the definition of the Laplacian of a general field can be rewritten with that operator. The advection operator—derived without coordinates, for both scalar and vector properties—is likewise rewritten.

Meanwhile comparison between the definitions of the various operators leads to coordinate-free definitions of the del-cross, del-dot, and del-squared operators. These together with the dot-del operator allow the four integral theorems to be condensed into a single generalized volume-integral theorem.

If the volume of integration is reduced to a thin curved slab of uniform thickness, with an edge-face perpendicular to the broad faces, the four integral theorems are reduced to their two-dimensional forms, each of which relates an integral over a surface segment to an integral around its enclosing curve, provided that the original closed-surface integral has no contribution from the broad faces of the slab. This proviso can be satisfied by construction in two of the four cases, yielding two general theorems, one of which is the Kelvin-Stokes theorem. Applying these two theorems to a segment of a closed surface, which expands to cover the entire surface, shows that the gradient is irrotational and the curl is solenoidal.

For all that, the gradient theorem is derived from the relation between the gradient and the directional derivative.

[To be continued.]


Introduction

Sheldon Axler, in his essay "Down with determinants!" (1995) and his ensuing book Linear Algebra Done Right (4th Ed., 2023–), does not entirely eliminate determinants, but introduces them as late as possible and then exploits them for what he calls their "main reasonable use in undergraduate mathematics", namely the change-of-variables formula for multiple integrals.[1] Here I treat coordinates in vector analysis somewhat as Axler treats determinants in linear algebra: I introduce coordinates as late as possible, and then exploit them in unconventionally rigorous derivations of vector-analytic identities from vector-algebraic identities. But I contrast with Axler in at least two ways. First, as my subtitle suggests, I have no intention of expanding my paper into a book. Brevity is of the essence. Second, while one may well avoid determinants in numerical linear algebra,[2] one can hardly avoid coordinates in numerical vector analysis! So I cannot extend the coordinate-minimizing path into computation. But I can extend it up to the threshold by expressing the operators of vector analysis in a suitably general coordinate system, leaving others to specialize it and compute with it. On the way, I can satisfy readers who need the concepts of vector analysis for theoretical purposes, and who would rather read a paper than a book.

The cost of coordinates

Mathematicians define a "vector" as a member of a vector space, which is a set whose members satisfy certain basic rules of algebra (called the vector-space axioms) with respect to another set (called a field), which has its own basic rules of algebra (the field axioms), and whose members are called "scalars". Physicists are more fussy. They typically want a "vector" to be not only a member of a vector space, but also a first-order tensor : a "tensor", meaning that it has an existence independent of any coordinate system with which it might be specified; and "first-order" (or "first-degree", or "first-rank"), meaning that it is specified by a one-dimensional array of numbers. Similarly, a 2nd-order tensor is specified by a 2-dimensional array (a matrix), and a 3rd-order by a 3-dimensional array, and so on; and a "scalar", being specified by a single number (a zero-dimensional array), is a zero-order tensor. In "vector analysis", we are greatly interested in applications to physical situations, and accordingly take the physicists' view on what constitutes a vector or a scalar.

So, for our purposes, defining a quantity by three components in (say) a Cartesian coordinate system is not enough to make it a vector, and defining a quantity as a real function of a list of coordinates is not enough to make it a scalar, because we still need to show that the quantity has an independent existence. One way to do this is to show that its coordinate representation behaves appropriately when the coordinate system is changed. (But don't worry if the following details look cryptic, because we won't be using them!) Independent existence of a quantity means that its coordinate representation is contravariant—that is, the representation changes so as to compensate for the change in the coordinate system.[a] But independent existence of an operator means that its coordinate representation is covariant—that is, the representation of the operator in the coordinate system, with the operand(s) and the result in that system, has the same form in one coordinate system as in another (except for features internal to the system).[b]

Here we circumvent these complications by the most obvious route: by initially defining things without coordinates. If, having defined something without coordinates, we then need to represent it with coordinates, we can choose the coordinate system for convenience.

The limitations of limits

In the branch of pure mathematics known as analysis, there is a thing called a limit, whereby for every positive ϵ  there exists a positive δ such that if some increment is less than δ, some error is less than ϵ. In the branch of applied mathematics known as continuum mechanics, there is a thing called reality, whereby if the increment is less than some positive δ, the assumption of a continuum becomes ridiculous, so that the error cannot be made less than an arbitrary ϵ. Yet vector "analysis" (together with higher-order tensors) is typically studied with the intention of applying it to some form of "continuum" mechanics, such as the modeling of elasticity, plasticity, fluid flow, or (widening the net) electrodynamics of ordinary matter; in short, it is studied with the intention of conveniently forgetting that, on a sufficiently small scale, matter is lumpy.[c] One might therefore submit that to express the principles of vector analysis in the language of limits is to strain at a gnat and swallow a camel. Here I avoid that camel by referring to elements of length or area or volume, each of which is small enough to allow some quantity or quantities to be considered uniform within it, but, for the same reason, large enough to allow such local averaging of the said quantity or quantities as is necessary to tune out the lumpiness.

We shall see bigger camels, where well-known authors define or misdefine a vector operator and then want to treat it like an ordinary vector (a quantity). These I also avoid.

Prerequisites

I assume that the reader is familiar with the algebra and geometry of vectors in 3D space, including the dot-product, the cross-product, and the scalar triple product, their geometric meanings, their expressions in Cartesian coordinates, and the identity

a × (b × c)  =  a⸱c b − a⸱b c ,

which we call the "expansion" of the vector triple product.[3] I further assume that the reader can generalize the concept of a derivative, so as to differentiate a vector with respect to a scalar, e.g.

 

or so as to differentiate a function of several independent variables "partially" w.r.t. one of them while the others are held constant, e.g.

 

But in view of the above remarks on limits, I also expect the reader to be tolerant of an argument like this: In a short time dt, let the vectors r and p change by dr and dp, respectively. Then

d/dt (r × p)  =  r × dp/dt  +  dr/dt × p ,

where, as always, the orders of the cross-products matter.[d] Differentiation of a dot-product behaves similarly, except that the orders don't matter; and if  p = mv, where m is a scalar and v is a vector, then

d(mv)/dt  =  (dm/dt) v  +  m dv/dt .

Or an argument like this:  If f is a function of x and y, then

∂²f/∂x∂y  =  ∂²f/∂y∂x ;

that is, we can switch the order of partial differentiation. If x is an abbreviation for /∂x, etc., this rule can be written in operational terms as

x y = ∂y x .

More generally, if i is an abbreviation for /∂xi where i = 1, 2,…, the rule becomes

i j = ∂j i .

These generalizations of differentiation, however, do not go beyond differentiation w.r.t. real variables, some of which are scalars, and some of which are coordinates. Vector analysis involves quantities that may be loosely described as derivatives w.r.t. a vector—usually the position vector.

Closed-surface integrals per unit volume

The term field, mentioned above in the context of algebraic axioms, has an alternative meaning: if r is the position vector, a scalar field is a scalar-valued function of r, and a vector field is a vector-valued function of r; both may also depend on time. These are the functions of which we want "derivatives" w.r.t. the vector r.

In this section I introduce four such derivatives—the gradient, the curl, the divergence, and the Laplacian—in a way that will seem unremarkable to those readers who aren't already familiar with them, but idiosyncratic to those who are. The gradient is commonly introduced in connection with a curve and its endpoints, the curl in connection with a surface segment and its enclosing curve, the divergence in connection with a volume and its enclosing surface, and the Laplacian as a composite of two of the above, initially applicable only to a scalar field. Here I introduce all four in connection with a volume and its enclosing surface; and I introduce the Laplacian as a concept in its own right, equally applicable to a scalar or vector field, and only later relate it to the others. My initial definitions of the gradient, the curl, and the Laplacian, although not novel, are usually thought to be more advanced than the common ones—in spite of being conceptually simpler, and in spite of being obvious variations on the same theme.

Instant integral theorems—with a caveat

Let V be a volume (3D region) enclosed by a surface S (a mathematical surface, not generally a physical barrier). Let n̂ be the unit normal vector at a general point on S, pointing out of V. Let n be the distance from S in the direction of n̂ (positive outside V, negative inside), and let ∂n be an abbreviation for ∂/∂n, where the derivative—commonly called the normal derivative—is tacitly assumed to exist.

In V, and on S, let p be a scalar field (e.g., pressure in a fluid, or temperature), and let q be a vector field (e.g., flow velocity, or heat-flow density), and let ψ be a generic field which may be a scalar or a vector. Let a general element of the surface S have area dS, and let it be small enough to allow n̂, p, q, and ∂n ψ to be considered uniform over the element. Then, for every element, the following four products are well defined:

n̂ p dS ,     n̂ × q dS ,     n̂ ⸱ q dS ,     ∂n ψ dS      (1)

If p is pressure in a non-viscous fluid, the first of these products is the force exerted by the fluid in V  through the area dS. The second product does not have such an obvious physical interpretation; but if q is circulating clockwise about an axis directed through V, the cross-product will be exactly tangential to S and will tend to have a component in the direction of that axis. The third product is the flux of q through the surface element; if q is flow velocity, the third product is the volumetric flow rate (volume per unit time) out of V  through dS ; or if q is heat-flow density, the third product is the heat transfer rate (energy per unit time) out of V  through dS. The fourth product, by analogy with the third, might be called the flux of the normal derivative of ψ through the surface element, but is equally well defined whether ψ is a scalar or a vector—or, for that matter, a matrix, or a tensor of any order, or anything else that we can differentiate w.r.t. n.

If we add up each of the four products over all the elements of the surface S, we obtain, respectively, the four surface integrals

∬_S n̂ p dS ,     ∬_S n̂ × q dS ,     ∬_S n̂ ⸱ q dS ,     ∬_S ∂n ψ dS      (2)

in which the double integral sign indicates that the range of integration is two-dimensional. The first surface integral takes a scalar field and yields a vector; the second takes a vector field and yields a vector; the third takes a vector field and yields a scalar; and the fourth takes (e.g.) a scalar field yielding a scalar, or a vector field yielding a vector. If p is pressure in a non-viscous fluid, the first integral is the force exerted by the fluid in V  on the fluid outside V. The second integral may be called the skew surface integral of q over S ,[4] or, for the reason hinted above, the circulation of q over S.  The third integral, commonly called the flux integral (or simply the surface integral) of q over S, is the total flux of q out of V. And the fourth integral is the surface integral of the outward normal derivative of ψ.

Let the volume V  be divided into elements. Let a general volume element have the volume dV and be enclosed by the surface δS —not to be confused with the area dS of a surface element, which may be an element of S or of δS. Now consider what happens if, instead of evaluating each of the above surface integrals over S, we evaluate it over each δS and add up the results for all the volume elements. In the interior of V, each surface element of area dS is on the boundary between two volume elements, for which the unit normals n̂ at dS, and the respective values of ∂n ψ, are equal and opposite. Hence when we add up the integrals over the surfaces δS, the contributions from the elements dS cancel in pairs, except on the original surface S, so that we are left with the original integral over S. So, for the four surface integrals in (2), we have respectively

∑ ∬_δS n̂ p dS  =  ∬_S n̂ p dS ,
∑ ∬_δS n̂ × q dS  =  ∬_S n̂ × q dS ,
∑ ∬_δS n̂ ⸱ q dS  =  ∬_S n̂ ⸱ q dS ,
∑ ∬_δS ∂n ψ dS  =  ∬_S ∂n ψ dS ,      (3)

Now comes a big "if":  if  we define the gradient of p (pronounced "grad p") as

∇p  =  (1/dV) ∬_δS n̂ p dS      (4g)

and the curl of q as

curl q  =  (1/dV) ∬_δS n̂ × q dS      (4c)

and the divergence of q as

div q  =  (1/dV) ∬_δS n̂ ⸱ q dS      (4d)

and the Laplacian of ψ as [e]

△ψ  =  (1/dV) ∬_δS ∂n ψ dS      (4L)

(where the letters after the equation number stand for gradient, curl, divergence, and Laplacian, respectively), then equations (3) can be rewritten

∑ ∇p dV  =  ∬_S n̂ p dS ,     ∑ curl q dV  =  ∬_S n̂ × q dS ,     ∑ div q dV  =  ∬_S n̂ ⸱ q dS ,     ∑ △ψ dV  =  ∬_S ∂n ψ dS .

But because each term in each sum has a factor dV, we call the sum an integral; and because the range of integration is three-dimensional, we use a triple integral sign. Thus we obtain the following four theorems relating integrals over an enclosing surface S  to integrals over the enclosed volume V :

∬_S n̂ p dS  =  ∭_V ∇p dV      (5g)

∬_S n̂ × q dS  =  ∭_V curl q dV      (5c)

∬_S n̂ ⸱ q dS  =  ∭_V div q dV      (5d)

∬_S ∂n ψ dS  =  ∭_V △ψ dV      (5L)

Of the above four results, only the third (5d) seems to have a standard name; it is called the divergence theorem (or Gauss's theorem or, more properly, Ostrogradsky's theorem[5]), and is indeed the best known of the four—although the other three, having been derived in parallel with it, may be said to be equally fundamental.

As each of the operators ∇, curl, and div calls for an integration w.r.t. area and then a division by volume, the dimension (or unit of measurement) of the result is the dimension of the operand divided by the dimension of length, as if the operation were some sort of differentiation w.r.t. position. Moreover, in each of equations (5g) to (5d), there is a triple integral on the right but only a double integral on the left, so that each of the operators ∇, curl, and div appears to compensate for a single integration. For these reasons, and for convenience, we shall describe them as differential operators. By comparison, the △ operator in (4L) or (5L) calls for a further differentiation w.r.t. n ; we shall therefore describe △ as a 2nd-order differential operator. (Another reason for these descriptions will emerge in due course.) As promised, the four definitions (4g) to (4L) are "obvious variations on the same theme" (although the fourth is somewhat less obvious than the others).

But remember the "if": Theorems (5g) to (5L) depend on definitions (4g) to (4L) and are therefore only as definite as those definitions! Equations (3), without assuming anything about the shapes and sizes of the closed surfaces δS (except, tacitly, that n̂ is piecewise well-defined), indicate that the surface integrals are additive with respect to volume. But this additivity, by itself, does not guarantee that the surface integrals are shared among neighboring volume elements in proportion to their volumes, as envisaged by "definitions" (4g) to (4L). Each of these "definitions" is unambiguous if, and only if, the ratio of the surface integral to dV  is insensitive to the shape and size of δS for a sufficiently small δS. Notice that the issue here is not whether the ratios specified in equations (4g) to (4L) are true vectors or scalars, independent of the coordinates; all of the operations needed in those equations have coordinate-free definitions. Rather, the issue is whether the resulting ratios are unambiguous notwithstanding the ambiguity of δS, provided only that δS is sufficiently small. That is the advertised "caveat", which must now be addressed.
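
As a purely numerical illustration of this caveat (and no substitute for the thought experiments that follow), the sketch below—assuming Python with numpy, and an arbitrarily chosen smooth field q—approximates the ratio in definition (4d) over two differently shaped small boxes about the same point; both values come out close to each other (and to the analytic divergence there), as the unambiguity claim requires for a smooth field.

import numpy as np

def q(x, y, z):
    # an arbitrary smooth field, chosen only for illustration; div q = y + z + x
    return np.array([x*y, y*z, z*x])

def cell_midpoints(lo, hi, n):
    edges = np.linspace(lo, hi, n + 1)
    return 0.5*(edges[:-1] + edges[1:]), (hi - lo)/n

def flux_per_unit_volume(center, half, n=50):
    """Outward flux of q over a small box with half-sizes `half`, divided by its volume."""
    total = 0.0
    for axis in range(3):
        u_ax, v_ax = [a for a in range(3) if a != axis]
        u, du = cell_midpoints(center[u_ax] - half[u_ax], center[u_ax] + half[u_ax], n)
        v, dv = cell_midpoints(center[v_ax] - half[v_ax], center[v_ax] + half[v_ax], n)
        U, V = np.meshgrid(u, v)
        for sign in (+1.0, -1.0):                         # the two faces normal to `axis`
            P = [None, None, None]
            P[axis] = np.full_like(U, center[axis] + sign*half[axis])
            P[u_ax], P[v_ax] = U, V
            total += sign * q(*P)[axis].sum() * du * dv   # sum of n̂ ⸱ q dS over this face
    return total / (8.0 * half[0] * half[1] * half[2])

c = (0.3, 0.7, 1.1)
print(flux_per_unit_volume(c, (1e-3, 1e-3, 1e-3)))   # a small cube
print(flux_per_unit_volume(c, (2e-3, 5e-4, 1e-3)))   # a flattened box about the same point
print(c[0] + c[1] + c[2])                            # analytic div q there = 2.1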

In accordance with our "applied" mathematical purpose, our proofs of the unambiguity of the differential operators will rest on a few thought experiments, each of which applies an operator to a physical field, say f, and obtains another physical field whose unambiguity is beyond dispute. The conclusion of the thought experiment is then applicable to any operand field whose mathematical properties are consistent with its interpretation as the physical field f ; the loss of generality, if any, is only what is incurred by that interpretation.

Unambiguity of the gradient

Suppose that a fluid with density ρ (a scalar field) flows with velocity v (a vector field) under the sole influence of the internal pressure p (a scalar field). Then the integral in (4g) is the force exerted by the fluid inside δS on the fluid outside, so that minus the integral is the force exerted on the fluid inside δS. Dividing by dV, we find that −∇p, as defined by (4g), is the force per unit volume,[6] which is the acceleration times the mass per unit volume; that is,

ρ dv/dt  =  −∇p .      (6g)

Now provided that the left side of this equation is locally continuous, it can be considered uniform inside the small δS, so that the left side is unambiguous, whence ∇p is also unambiguous. If there are additional forces on the fluid element, e.g. due to gravity and/or viscosity, then −∇p is not the sole contribution to density-times-acceleration, but is still the contribution due to pressure, which is still unambiguous.

By showing the unambiguity of definition (4g), we have confirmed theorem (5g). In the process we have seen that the volume-based definition of the gradient is useful for the modeling of fluids, and intuitive in that it formalizes the common notion that a pressure "gradient" gives rise to a force.

Unambiguity of the divergence

In the aforesaid fluid, in a short time dt, the volume that flows out of the fixed closed surface δS  through a fixed surface element of area dS  is v dt ⸱ n̂ dS.  Multiplying by density and integrating over δS, we find that the mass flowing out of δS  in time dt is  dt ∬_δS ρ v ⸱ n̂ dS .  Dividing this by dV, and then by dt, we get the rate of reduction of density inside δS ; that is,

(1/dV) ∬_δS n̂ ⸱ ρv dS  =  −∂ρ/∂t ,

where the derivative w.r.t. time is evaluated at a fixed location (because δS is fixed), and is therefore written as a partial derivative (because other variables on which ρ might depend—namely the coordinates—are held constant). Provided that the right-hand side is locally continuous, it can be considered uniform inside δS and is therefore unambiguous, so that the left side is likewise unambiguous. But the left side is simply div ρv  as defined by (4d),[f] which is therefore also unambiguous,[7] confirming theorem (5d). In short, the divergence operator is that which maps ρv to the rate of reduction of density at a fixed point:

div ρv  =  −∂ρ/∂t .      (7d)

This result, which expresses conservation of mass, is a form of the so-called equation of continuity.

The partial derivative ∂ρ/∂t in (7d) must be distinguished from the material derivative dρ/dt, which is evaluated at a point that moves with the fluid.[g] [Similarly, dv/dt in (6g) is the material acceleration, because it is the acceleration of the mobile mass—not of a fixed point! ]  To re-derive the equation of continuity in terms of the material derivative, the volume v dt ⸱ n̂ dS, which flows out through dS in time dt (as above), is integrated over δS to obtain the increase in volume of the mass initially contained in dV. Dividing this by the mass, ρ dV, gives the increase in specific volume (1/ρ) of that mass, and then dividing by dt gives the rate of change of specific volume; that is,

(1/(ρ dV)) ∬_δS n̂ ⸱ v dS  =  d(1/ρ)/dt .

Multiplying by ρ² and comparing the left side with (4d), we obtain

ρ div v  =  −dρ/dt .      (7d')

Whereas (7d) shows that div ρv is unambiguous, (7d') shows that div v is unambiguous (provided that the right-hand sides are locally continuous). In accordance with the everyday meaning of "divergence", (7d') also shows that div v is positive if the fluid is expanding (ρ decreasing), negative if it is contracting (ρ increasing), and zero if it is incompressible. In the last case, the equation of continuity reduces to

div v  =  0   [for an incompressible fluid].      (7i)

For incompressible flow, any tubular surface tangential to the flow velocity, and consequently with no flow in or out of the "tube", has the same volumetric flow rate across all cross-sections of the "tube", as if the surface were the wall of a pipe full of liquid (except that the surface is not necessarily stationary). Accordingly, a vector field with zero divergence is described as solenoidal (from the Greek word for "pipe"). More generally, a solenoidal vector field has the property that for any tubular surface tangential to the field, the flux integrals across any two cross-sections of the "tube" are the same—because otherwise there would be a net flux integral out of the closed surface comprising the two cross-sections and any segment of tube between them, in which case, by the divergence theorem (5d), the divergence would have to be non-zero somewhere inside, contrary to (7i).

Unambiguity of the curl (and gradient)

The unambiguity of the curl (4c) follows from the unambiguity of the divergence. Taking dot-products of (4c) with an arbitrary constant vector b, we get

curl q ⸱ b  =  (1/dV) ∬_δS (n̂ × q) ⸱ b dS  =  (1/dV) ∬_δS n̂ ⸱ (q × b) dS ;

that is, by (4d),

curl q ⸱ b  =  div (q × b)   [for uniform b].      (8c)

(The parentheses on the right, although helpful because of the spacing, are not strictly necessary, because the alternative binding would be (div q), which is a scalar, whose cross-product with the vector b is not defined. And the left-hand expression does not need parentheses, because it can only mean the dot-product of a curl with the vector b; it cannot mean the curl of a dot-product, because the curl of a scalar field is not defined.) This result (8c) is an identity if the vector b is independent of location, so that it can be taken inside or outside the surface integral; thus b may be a uniform vector field, and may be time-dependent. If we make b a unit vector, the left side of the identity is the (scalar) component of curl q in the direction of b, and the right side is unambiguous. Thus the curl is unambiguous because its component in any direction is unambiguous. This confirms theorem (5c).

Similarly, the unambiguity of the divergence implies the unambiguity of the gradient. Starting with (4g), taking dot-products with an arbitrary uniform vector b, and proceeding as above, we obtain

∇p ⸱ b  =  div (p b)   [for uniform b].      (8g)

(The left-hand side does not need parentheses, because it can only mean the dot-product of a gradient with the vector b; it cannot mean the gradient of the dot-product of a scalar field with a vector field, because that dot-product would not be defined.) If we make b a unit vector, this result (8g) says that the (scalar) component of ∇p in the direction of b is given by the right-hand side, which again is unambiguous. So here we have a second explanation of the unambiguity of the gradient: like the curl, it is unambiguous because its component in any direction is unambiguous.
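
For readers who want a mechanical check, the short sketch below—assuming Python with sympy, Cartesian coordinates, a uniform vector b, and arbitrarily chosen fields p and q—verifies identities (8c) and (8g) symbolically.

import sympy as sp
from sympy.vector import CoordSys3D, curl, divergence, gradient

N = CoordSys3D('N')
b1, b2, b3 = sp.symbols('b1 b2 b3')                   # components of the uniform vector b
b = b1*N.i + b2*N.j + b3*N.k

p = N.x**2*sp.sin(N.y) + N.z                          # an arbitrary scalar field
q = N.y*N.z*N.i + N.x**2*N.j + sp.cos(N.x*N.y)*N.k    # an arbitrary vector field

# (8c):  curl q ⸱ b = div (q × b)
print(sp.simplify(curl(q).dot(b) - divergence(q.cross(b))))   # -> 0
# (8g):  ∇p ⸱ b = div (p b)
print(sp.simplify(gradient(p).dot(b) - divergence(p*b)))      # -> 0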

We might well ask what happens if we take cross-products with b on the left, instead of dot-products. If we start with (4g), the process is straightforward: in the end we can switch the order of the cross-product on the left, and change the sign on the right, obtaining

∇p × b  =  curl (p b)   [for uniform b].      (8p)

(Again no parentheses are needed.) If we start with (4c) instead, and take b inside the integral, we get a vector triple product to expand, which leads to

b × curl q  =  (1/dV) ∬_δS n̂ (b ⸱ q) dS  −  (1/dV) ∬_δS (b ⸱ n̂) q dS .      (8q)

Here the first term on the right is simply  ∇ b⸱q  (the gradient of the dot-product). The second term is more problematic. If we had a scalar p instead of the vector q, we could take b outside the second integral, so that the second term would be (minus) b ⸱ ∇p. This suggests that the actual second term should be (minus) b ⸱ ∇q. But we do not yet know how to interpret that expression for a vector field q; and if we were to adopt the second term above (without the sign) as the definition of b⸱∇ q (treating b⸱ as an operator), that would be open to the objection that b⸱∇ q had been defined only for uniform b, whereas b ⸱ ∇p (for scalar p) is defined whether b is uniform or not. So, for the moment, let us put (8q) aside and run with (8c), (8g), and (8p).

Another meaning of the gradient

Let ŝ be a unit vector in a given direction, and let s be a parameter measuring distance (arc length) along a path in that direction. By equation (8g) and definition (4d), we have

∇p ⸱ ŝ  =  div (p ŝ)  =  (1/dV) ∬_δS n̂ ⸱ p ŝ dS ,

where, by the unambiguity of the divergence, the shape of the closed surface δS enclosing dV  can be chosen for convenience. So let δS be a right cylinder with cross-sectional area α  and perpendicular height ds , with the path passing perpendicularly through the end-faces at parameter-values s and s+ds , where the outward unit normal n̂ consequently takes the values −ŝ and ŝ , respectively. And let the cross-sectional dimensions be small compared with ds  so that the values of p at the end-faces, say p and p+dp, can be taken to be the same as where the end-faces cut the path. Then  dV = α ds , and the surface integral over δS includes only the contributions from the end-faces (because n̂ is perpendicular to ŝ elsewhere); those contributions are respectively  −ŝ ⸱ p ŝ α  and  ŝ ⸱ (p + dp) ŝ α ,  i.e.  −p α  and  (p + dp) α .  With these substitutions the above equation becomes

∇p ⸱ ŝ  =  ( −p α + (p + dp) α ) / (α ds) ;

that is,

∇p ⸱ ŝ  =  ∂p/∂s ,      (9g)

where the right-hand side, commonly called the directional derivative of p in the ŝ direction,[8] is the derivative of p w.r.t. distance in that direction. Although (9g) has been obtained by taking that direction as fixed, the equality is evidently maintained if s measures arc length along any path tangential to ŝ at the point of interest.

Equation (9g) is an alternative definition of the gradient: it says that the gradient of p  is the vector whose component in any direction is the directional derivative of p  in that direction. For real p , this component has its maximum, namely |∇p| , in the direction of ∇p; thus the gradient of p  is the vector whose direction is that in which the derivative of p  w.r.t. distance is a maximum, and whose magnitude is that maximum. This is the usual conceptual definition of the gradient.[9] Sometimes it is convenient to work directly from this definition. For example, in Cartesian coordinates (x, y, z), if a scalar field is given by x , its gradient is obviously the unit vector in the direction of the x axis, usually called i; that is, ∇x = i. Similarly, if  r = r r̂  is the position vector, then ∇r = r̂.

If  ŝ is tangential to a level surface of p (a surface of constant p), then ∂p/∂s  in that direction is zero, in which case (9g) says that ∇p (if not zero) is orthogonal to ŝ.  So ∇p is orthogonal to the surfaces of constant p  (as we would expect, having just shown that the direction of ∇p is that in which p varies most steeply).

If p is uniform—that is, if it has no spatial variation—then its derivative w.r.t. distance in every direction is zero; that is, the component of ∇p in every direction is zero, so that ∇p must be the zero vector. In short, the gradient of a uniform scalar field is zero.
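
For readers who like to see (9g) checked mechanically, here is a small symbolic sketch, assuming Python with sympy and an arbitrarily chosen scalar field and unit vector: the derivative of p w.r.t. arc length along a straight path in the direction ŝ agrees with the component of ∇p in that direction.

import sympy as sp
from sympy.vector import CoordSys3D, gradient

N = CoordSys3D('N')
p = N.x**2*N.y + sp.exp(N.z)                   # an arbitrary scalar field

s = sp.symbols('s')
x0, y0, z0 = sp.symbols('x0 y0 z0')            # an arbitrary starting point r0
a = sp.Matrix([2, -1, 2])/3                    # an arbitrary unit vector ŝ (|ŝ| = 1)

# p along the straight path r(s) = r0 + s ŝ, differentiated w.r.t. arc length s at s = 0
p_on_path = p.subs({N.x: x0 + s*a[0], N.y: y0 + s*a[1], N.z: z0 + s*a[2]})
dpds = sp.diff(p_on_path, s).subs(s, 0)

# the component of ∇p in the direction ŝ, evaluated at r0
s_hat = a[0]*N.i + a[1]*N.j + a[2]*N.k
grad_component = gradient(p).dot(s_hat).subs({N.x: x0, N.y: y0, N.z: z0})

print(sp.simplify(dpds - grad_component))      # -> 0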

Unambiguity of the Laplacian

Armed with our new definition of the gradient (9g), we can revisit our definition of the Laplacian (4L). If ψ is a scalar field, then, by (9g),  ∂n ψ  can be replaced by  ∇ψ ⸱ n̂  in (4L), which then becomes

△ψ  =  (1/dV) ∬_δS ∇ψ ⸱ n̂ dS ;      (9L)

that is, by definition (4d),

△ψ  =  div ∇ψ   [for scalar ψ].

So the Laplacian of a scalar field is the divergence of the gradient. This is the usual introductory definition of the Laplacian—and on its face is applicable only in the case of a scalar field. The unambiguity of the Laplacian, in this case, follows from the unambiguity of the divergence and the gradient.

If, on the contrary, ψ in definition (4L) is a vector field, then we can again take dot-products with a uniform vector b, obtaining

△ψ ⸱ b  =  (1/dV) ∬_δS ∂n (ψ ⸱ b) dS  =  △(ψ ⸱ b)   [for uniform b].

If we make b a unit vector, this says that the scalar component of the Laplacian of a vector field, in any direction, is the Laplacian of the scalar component of that vector field in that direction. As we have just established that the latter is unambiguous, so is the former.

But the unambiguity of the Laplacian can be generalized further. If

ψ  =  ∑i αi ψi ,

where each ψi is a scalar field, and each αi is a constant, and the counter i ranges from (say) 1 to k , then it is clear from (4L) that

△ ∑i αi ψi  =  ∑i αi △ψi .      (10)

In words, this says that the Laplacian of a linear combination of fields is the same linear combination of the Laplacians of the same fields—or, more concisely, that the Laplacian is linear. I say "it is clear" because the Laplacian as defined by (4L) is itself a linear combination, so that (10) merely asserts that we can regroup the terms of a nested linear combination; the gradient, curl, and divergence as defined by (4g) to (4d) are likewise linear. It follows from (10) that the Laplacian of a linear combination of fields is unambiguous if the Laplacians of the separate fields are unambiguous. Now we have supposed that the fields ψi are scalar and that the coefficients αi are constants. But the same logic applies if the "constants" are uniform basis vectors (e.g., i, j, k), so that the "linear combination" can represent any vector field, whence the Laplacian of any vector field is unambiguous. And the same logic applies if the "constants" are chosen as a "basis" for a space of tensors of any order, so that the Laplacian of any tensor field of that order is unambiguous, and so on. In short, the Laplacian of any field that we can express with a uniform basis is unambiguous.

The dot-del, del-cross, and del-dot operators

The gradient operator ∇ is also called del.[h] If it simply denotes the gradient, we tend to pronounce it "grad" in order to emphasize the result. But it can also appear in combination with other operators to give other results, and in those contexts we tend to pronounce it "del".

One such combination is "dot del"— as in "b⸱∇ ", which we proposed after (8q), but did not manage to define satisfactorily for a vector operand. With our new definition of the gradient (9g), we can now make a second attempt. A general vector field q can be written |q| q̂ , so that

q ⸱ ∇ψ  =  |q| q̂ ⸱ ∇ψ .

If ψ is a scalar field, we can apply (9g) to the right-hand side, obtaining

q ⸱ ∇ψ  =  |q| ∂ψ/∂sq ,

where sq is distance in the direction of q. For scalar ψ, this result is an identity between previously defined quantities. For non-scalar ψ, we have not yet defined the left-hand side, but the right-hand side is still well-defined and self-explanatory (provided that we can differentiate ψ w.r.t. sq). So we are free to adopt

q ⸱ ∇ψ  =  |q| ∂ψ/∂sq ,      (11)

where sq is distance in the direction of q , as the general definition of the operator q⸱∇ , and to interpret it as defining both a unary operator  q⸱∇  which operates on a generic field, and a binary operator  ⸱∇  which takes a (possibly uniform) vector field on the left and a generic field on the right.

For the special case in which q is a unit vector ŝ , with s measuring distance in the direction of  ŝ , definition (11) reduces to

ŝ ⸱ ∇ψ  =  ∂ψ/∂s ,      (12)

which agrees with (9g) but now holds for a generic field ψ [whereas (9g) was for a scalar field, and was derived as a theorem based on earlier definitions]. So ŝ⸱∇ , with a unit vector ŝ , is the directional-derivative operator on a generic field.

In particular, if  ŝ = n̂  we have

n̂ ⸱ ∇ψ  =  ∂ψ/∂n  =  ∂n ψ ,

which we may substitute into the original definition of the Laplacian (4L) to obtain

△ψ  =  (1/dV) ∬_δS n̂ ⸱ ∇ψ dS ,      (13L)

which is just (9L) again, except that it now holds for a generic field.

If our general definition of the gradient (4g) is also taken as the general definition of the ∇ operator,[10] then, comparing (4g) with (4c), (4d), and (13L), we see that

curl  =  ∇ × ( ) ,     div  =  ∇ ⸱ ( ) ,     △  =  ∇ ⸱ ∇ ( ) ,

where the parentheses may seem to be required on account of the closing dS in (4g). But if we write the factor dS before the integrand, the del operator in (4g) becomes

∇  =  (1/dV) ∬_δS dS n̂ ,

if  we insist that it is to be read as an operator looking for an operand, and not as a self-contained expression. Then, if we similarly bring forward the dS in (4c), (4d), and (13L), the respective operators become

∇ ×  =  (1/dV) ∬_δS dS n̂ × ,     ∇ ⸱  =  (1/dV) ∬_δS dS n̂ ⸱ ,     ∇ ⸱ ∇  =  (1/dV) ∬_δS dS n̂ ⸱ ∇      (14)

(pronounced "del cross", "del dot", and "del dot del"), of which the last is usually abbreviated as ∇²  ("del squared").[i] Because these operational equivalences follow from coordinate-free definitions, they must remain valid when correctly expressed in any coordinate system.[j] That does not mean that they are always convenient or always conducive to the avoidance of error—of which we shall have more to say in due course. But they sometimes make useful mnemonic devices. For example, they let us rewrite identities (8c), (8g), and (8p) as

∇ × q ⸱ b  =  ∇ ⸱ (q × b) ,     ∇p ⸱ b  =  ∇ ⸱ (p b) ,     ∇p × b  =  ∇ × (p b)     for uniform b.      (15)

These would be basic algebraic vector identities if ∇ were an ordinary vector, and one could try to derive them from the "algebraic" behavior of ∇ ; but they're not, because it isn't, so we didn't !  Moreover, these simple "algebraic" rules are for a uniform b, and do not of themselves tell us what to do if b is spatially variable; for example, (8g) is not applicable to (7d).

The advection operator

Variation or transportation of a property of a medium due to motion with the medium is called advection (which, according to its Latin roots, means "carrying to"). Suppose that a medium (possibly a fluid) moves with a velocity field v in some inertial reference frame. Let ψ be a field (possibly a scalar field or a vector field) expressing some property of the medium (e.g., density, or acceleration, or stress,[k]… or even v itself). We have seen that the time-derivative of ψ may be specified in two different ways: as the partial derivative ∂ψ/∂t , evaluated at a fixed point (in the chosen reference frame), or as the material derivative dψ/dt, evaluated at a point moving at velocity v (i.e., with the medium). The difference  dψ/dt − ∂ψ/∂t is due to motion with the medium. To find another expression for this difference, let s be a parameter measuring distance along the path traveled by a particle of the medium. Then, for a short time interval dt, the surface representing the small change in ψ (or each component thereof) as a function of the small changes in t and s  (plotted on perpendicular axes) can be taken as a plane through the origin, so that

dψ  =  (∂ψ/∂t) dt  +  (∂ψ/∂s) ds ;

that is, the change in ψ is the sum of the changes due to the change in t and the change in s . Dividing by dt gives

dψ/dt  =  ∂ψ/∂t  +  (∂ψ/∂s)(ds/dt) ;

i.e.,

dψ/dt  =  ∂ψ/∂t  +  |v| ∂ψ/∂s

(and the first term on the right could have been written ∂t ψ). So the second term on the right is the contribution to the material derivative due to motion with the medium; it is called the advective term, and is non-zero wherever a particle of the medium moves along a path on which ψ varies with location—even if ψ at each location is constant over time.  So the operator  |v| ∂s , where s measures distance along the path, is the advection operator : it maps a property of a medium to the advective term in the time-derivative of that property. If ψ is v itself, the above result becomes

dv/dt  =  ∂v/∂t  +  |v| ∂v/∂s ,

where the left-hand side (the material acceleration) is as given by Newton's second law, and the first term on the right (which we might call the "partial" acceleration) is the time-derivative of velocity in the chosen reference frame, and the second term on the right (the advective term) is the correction that must be added to the "partial" acceleration in order to obtain the material acceleration. This term is non-zero wherever velocity is non-zero and varies along a path, even if the velocity at each point on the path is constant over time (as when water speeds up while flowing at a constant volumetric rate into a nozzle). Paradoxically, while the material acceleration and the "partial" acceleration are apparently linear (first-degree) in v, their difference (the advective term) is not. Thus the distinction between ∂ψ/∂t and /dt  has the far-reaching implication that fluid dynamics is non-linear.

Applying (11) to the last two equations, we obtain respectively

dψ/dt  =  ∂ψ/∂t  +  v ⸱ ∇ψ      (16)

and

dv/dt  =  ∂v/∂t  +  v ⸱ ∇v ,      (16v)

where, in each case, the second term on the right is the advective term. So the advection operator can also be written  v⸱∇ .
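
The nozzle example above can be made concrete with a short symbolic sketch, assuming Python with sympy and a steady one-dimensional speed profile f(x) chosen only for illustration: the "partial" acceleration vanishes, yet the advective term (v⸱∇)v = f f′ i survives—and is quadratic, not linear, in the field.

import sympy as sp

x, y, z, t = sp.symbols('x y z t')
f = sp.Function('f')                           # an arbitrary steady speed profile f(x)
v = sp.Matrix([f(x), 0, 0])                    # steady flow along x: no t-dependence

coords = (x, y, z)
# the advective term (v ⸱ ∇)v, written out componentwise: sum over j of v_j ∂v_i/∂x_j
advective = sp.Matrix([sum(v[j]*sp.diff(v[i], coords[j]) for j in range(3))
                       for i in range(3)])

print(sp.diff(v, t).T)                         # -> Matrix([[0, 0, 0]]): the "partial" acceleration
print(advective.T)                             # -> Matrix([[f(x)*Derivative(f(x), x), 0, 0]])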

When the generic ψ  in (16) is replaced by the density ρ , we get a relation between ∂ρ/∂t and dρ/dt, both of which we have seen before—in equations (7d) and (7d') above. Substituting from those equations then gives

div ρv  =  ρ div v  +  v ⸱ ∇ρ ,      (17)

where ∇ρ can be taken as a gradient since ρ is scalar. This result is in fact an identity—a product rule for the divergence—as we shall eventually confirm by another method.
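
Readers who want to see identity (17) confirmed ahead of that later method can check it symbolically; the sketch below assumes Python with sympy and arbitrarily chosen fields ρ and v in Cartesian coordinates.

import sympy as sp
from sympy.vector import CoordSys3D, divergence, gradient

N = CoordSys3D('N')
rho = N.x*N.y + sp.sin(N.z)                            # an arbitrary scalar field ρ
v = N.y*N.z*N.i + sp.exp(N.x)*N.j + N.x**2*N.k         # an arbitrary vector field v

lhs = divergence(rho*v)                                # div ρv
rhs = rho*divergence(v) + v.dot(gradient(rho))         # ρ div v + v ⸱ ∇ρ
print(sp.simplify(lhs - rhs))                          # -> 0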

Generalized volume-integral theorem

We can rewrite the fourth integral theorem (5L) in the "dot del" notation as

∬_S n̂ ⸱ ∇ψ dS  =  ∭_V ∇ ⸱ ∇ψ dV .      (18L)

Then, using notations (14), we can condense all four integral theorems (5g), (5c), (5d), and (18L) into the single equation

∬_S n̂ ∘ ψ dS  =  ∭_V ∇ ∘ ψ dV ,      (19)

where the "circ" symbol ∘ is a generic binary operator which may be replaced by a null (direct juxtaposition of the operands) for theorem (5g), or a cross for (5c), or a dot for (5d), or ⸱∇ for (18L). This single equation is a generalized volume-integral theorem, relating an integral over a volume to an integral over its enclosing surface.[l]

Theorem (19) is based on the following definitions, which have been found unambiguous:

  • the gradient of a scalar field p is the closed-surface integral of  n̂ p per unit volume, where n̂ is the outward unit normal;
  • the divergence of a vector field is the outward flux integral per unit volume;
  • the curl of a vector field is the skew surface integral per unit volume, also called the surface circulation per unit volume; and
  • the Laplacian is the closed-surface integral of the outward normal derivative, per unit volume.

The gradient maps a scalar field to a vector field; the divergence maps a vector field to a scalar field; the curl maps a vector field to a vector field; and the Laplacian maps a scalar field to a scalar field, or a vector field to a vector field, etc.

The gradient of p, as defined above, has been shown to be also

  • the vector whose component in any direction is the directional derivative of p in that direction (i.e. the derivative of p w.r.t. distance in that direction), and
  • the vector whose direction is that in which the directional derivative of p is a maximum, and whose magnitude is that maximum.

Consistent with these alternative definitions of the gradient, we have defined the  ⸱∇  operator so that  ŝ⸱∇ (for a unit vector ŝ) is the operator yielding the directional derivative in the direction of  ŝ , and we have used that notation to bring theorem (5L) under theorem (19).

So far, we have said comparatively little about the curl. That imbalance will now be rectified.

Closed-circuit integrals per unit area

Instant integral theorems—on a condition

Theorems (5g) to (5L) are three-dimensional: each of them relates an integral over a volume V  to an integral over its enclosing surface S. We now seek analogous two-dimensional theorems, each of which relates an integral over a surface segment to an integral around its enclosing curve. For maximum generality, the surface segment should be allowed to be curved into a third dimension.[m] Theorems of this kind can be obtained as special cases of theorems (5g) to (5L) by suitably choosing V and S ; this is another advantage of our "volume first" approach.

Let Σ be a surface segment enclosed by a curve C (a circuit or closed contour), and let l be a parameter measuring arc length around C , so that a general element of C has length dl ; and let a general element of the surface Σ have area dΣ. Let ν̂ be the unit normal vector at a general point on Σ , and let t̂ be the unit tangent vector to C at a general point on C in the direction of increasing l. In the original case of a surface enclosing a volume, we had to decide whether the unit normal pointed into or out of the volume (we chose the latter). In the present case of a circuit enclosing a surface segment, we have to decide whether l is measured clockwise or counterclockwise as seen when looking in the direction of the unit normal, and we choose clockwise. So l is measured clockwise about ν̂ , and C is traversed clockwise about ν̂ .

From Σ  we can construct obvious candidates for V and S. From every point on Σ , erect a perpendicular with a uniform small height h in the direction of ν̂. Then simply let V be the volume occupied by all the perpendiculars, and let S be its enclosing surface. Thus V is a (generally curved) thin slab of uniform thickness h, whose enclosing surface S consists of two close parallel (generally curved) broad faces connected by a perpendicular edge-face of uniform height h ; and we can treat ν̂ as a vector field by extrapolating it perpendicularly from Σ. If we can arrange for h to cancel out, the volume V will serve as a 3D representation of the surface segment Σ while the edge-face will serve as a 2D representation of the curve C , so that our four theorems will relate an integral around C to an integral over Σ—provided that there is no contribution from the broad faces to the integral over S. For brevity, let us call this proviso the 2D condition.

If  the 2D condition is satisfied, an integral over the new S reduces to an integral over the edge-face, on which

dS  =  h dl ,

so that the cancellation of h will leave an integral over C  w.r.t. length. Meanwhile, in an integral over the new V, regardless of the 2D condition, we have

dV  =  h dΣ ,

so that the cancellation of h will leave an integral over Σ w.r.t. area. So, substituting for dS and dV  in (5g) to (5L), and canceling h as planned, we obtain respectively

∮_C n̂ p dl  =  ∬_Σ ∇p dΣ      (20g)

∮_C n̂ × q dl  =  ∬_Σ curl q dΣ      (20c)

∮_C n̂ ⸱ q dl  =  ∬_Σ div q dΣ      (20d)

∮_C ∂n ψ dl  =  ∬_Σ △ψ dΣ      (20L)

all subject to the 2D condition. In each equation, the circle on the left integral sign acknowledges that the integral is around a closed loop. The unit vector n̂ , which was normal to the edge-face, is now normal to both t̂ and ν̂ ; that is, n̂ is tangential to the surface segment Σ and projects perpendicularly outward from its bounding curve.

On the left side of (20g), the 2D condition is satisfied if (but not only if) n̂ p takes equal-and-opposite values at any two opposing points on opposing broad faces of S , i.e. if p takes the same value at such points, i.e. if p has a zero directional derivative normal to Σ , i.e. if ∇p has no component normal to Σ. Thus a sufficient "2D condition" for (20g) is the obvious one.

Skipping forward to (20L), we see that the 2D condition is satisfied if  ∂n ψ  takes equal-and-opposite values at any two opposing points on opposing broad faces of S , i.e. if  ∂ψ/∂ν  (where ν measures distance in the direction of ν̂) takes the same value at such points, i.e. if  ∂²ψ/∂ν² = 0 .

For (20c) and (20d), the 2D constraint can be satisfied by construction, with more useful results—as explained under the next two headings. To facilitate this process, we first make a minor adjustment to Σ and C. Noting that any curved surface segment can be approximated to any desired accuracy by a polyhedral surface enclosed by a polygon, we shall indeed consider Σ to be a polyhedral surface made up of small planar elements, dΣ being the area of a general element, and we shall indeed consider C to be a polygon with short sides, dl being the length of a general side.[n] The benefit of this trick, as we shall see, is to make the unit normal ν̂ uniform over each surface element, without forcing us to treat q (or any other field) as uniform over the same element. But, as the elements of C can independently be made as short as we like (dividing straight sides into shorter elements if necessary!), we can still consider  ν̂ , q , and t̂  to be uniform over each element of C.

Special case for the gradient

In (20c), the 2D condition is satisfied by putting  q = p ν̂  (where p is a scalar field), because then the integrand on the left is zero on the broad faces of S , where n̂ is parallel to ν̂. Equation (20c) then becomes

∮_C n̂ × p ν̂ dl  =  ∬_Σ curl (p ν̂) dΣ .      (21n)

Now on the left,  n̂ × p ν̂ = p (n̂ × ν̂) = −p t̂ ;  and on the right, over each surface element, the unit normal ν̂ is uniform so that, by (8p),  curl (p ν̂) = ∇p × ν̂ = −ν̂ × ∇p .  With these substitutions, the minus signs cancel and we get

∮_C p t̂ dl  =  ∬_Σ ν̂ × ∇p dΣ ,      (21g)

or, if we write  dr  for  t̂ dl  and  dΣ  for  ν̂ dΣ ,

∮_C p dr  =  ∬_Σ dΣ × ∇p .      (21r)

This result, although well attested in the literature,[11] does not seem to have a name—unlike the next result.

Special case for the curl

In (20d), the 2D condition is satisfied if q is replaced by  ν̂ × q ,  because then (again) the integrand on the left is zero on the broad faces of S , where n̂ is parallel to ν̂. Equation (20d) then becomes

∮_C n̂ ⸱ (ν̂ × q) dl  =  ∬_Σ div (ν̂ × q) dΣ .      (22n)

Now on the left, the integrand can be written  n̂ ⸱ (ν̂ × q) = (n̂ × ν̂) ⸱ q = −t̂ ⸱ q ;  and on the right,  div (ν̂ × q) = −div (q × ν̂) = −ν̂ ⸱ curl q  by identity (8c), since ν̂ is uniform over each surface element.  With these substitutions, the minus signs cancel and we get

∮_C t̂ ⸱ q dl  =  ∬_Σ ν̂ ⸱ curl q dΣ ,      (22c)

or, if we again write  dr  for  t̂ dl  and  dΣ  for  ν̂ dΣ ,

∮_C q ⸱ dr  =  ∬_Σ curl q ⸱ dΣ .      (22r)

This result—the best-known theorem relating an integral over a surface segment to an integral around its enclosing curve, and the best-known theorem involving the curl—is called Stokes' theorem or, more properly, the Kelvin–Stokes theorem,[12] or simply the curl theorem.[o]

The integral on the left of (22c) or (22r) is called the circulation of the vector field q around the closed curve C. So, in words, the Kelvin-Stokes theorem says that the circulation of a vector field around a closed curve is equal to the flux of the curl of that vector field through any surface spanning that closed curve.

Now let a general element of Σ (with area dΣ ) be enclosed by the curve δC, traversed in the same direction as the outer curve C. Then, applying (22c) to the single element, we have

∮_δC t̂ ⸱ q dl  =  ν̂ ⸱ curl q dΣ ;

that is,

ν̂ ⸱ curl q  =  (1/dΣ) ∮_δC t̂ ⸱ q dl ,      (23c)

where the right-hand side is simply the circulation per unit area.

Equation (23c) is an alternative definition of the curl: it says that the curl of q is the vector whose component in any direction is the circulation of q per unit area of a surface whose normal points in that direction. For real q, this component has its maximum, namely |curl q| , in the direction of curl q; thus the curl of q is the vector whose direction is that which a surface must face if the circulation of q per unit area of that surface is to be a maximum, and whose magnitude is that maximum. This is the usual conceptual definition of the curl.[13]
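
As a numerical companion to (23c)—offered only as an illustration, assuming Python with numpy and an arbitrarily chosen field—the circulation of q around a small square facing the z direction (traversed counterclockwise as seen from +z, which is the orientation convention chosen above), divided by the square's area, comes out equal to the z component of curl q at the centre.

import numpy as np

def q(x, y, z):
    # an arbitrary smooth field; the z component of curl q is 2x + 1
    return np.array([-y, x**2, z])

def circulation_per_unit_area(center, a, n=400):
    """Circulation of q around a square of side a in the plane z = center[2],
    centred at `center`, divided by the square's area."""
    cx, cy, cz = center
    s = (np.arange(n) + 0.5)*(a/n) - a/2          # midpoints along each side
    ds = a/n
    z = np.full_like(s, cz)
    total = 0.0
    total += q(cx + s, np.full_like(s, cy - a/2), z)[0].sum()*ds   # bottom side, +x direction
    total += q(np.full_like(s, cx + a/2), cy + s, z)[1].sum()*ds   # right side, +y direction
    total -= q(cx + s, np.full_like(s, cy + a/2), z)[0].sum()*ds   # top side, -x direction
    total -= q(np.full_like(s, cx - a/2), cy + s, z)[1].sum()*ds   # left side, -y direction
    return total/a**2

c = (0.4, -0.2, 0.7)
print(circulation_per_unit_area(c, 1e-3))   # ≈ 1.8
print(2*c[0] + 1)                           # analytic z component of curl q = 1.8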

[Notice, however, that our original volume-based definition (4c) is more succinct: the curl is the closed-surface circulation per unit volume, i.e. the skew surface integral per unit volume.]

It should now be clear where the curl gets its name (coined by Maxwell), and why it is also called the rotation (indeed the curl operator is sometimes written "rot", especially in Continental languages, in which "rot" does not have the same unfortunate everyday meaning as in English). It should be similarly unsurprising that a vector field with zero curl is described as irrotational (which one must carefully pronounce differently from "irritational"!), and that the curl of the velocity of a fluid is called the vorticity.

However, a field does not need to be vortex-like in order to have a non-zero curl; for example, by identity (8p), in Cartesian coordinates, the velocity field xj has a curl equal to  ∇x × j = i × j = k ,  although it describes a shearing motion rather than a rotating motion. This is understandable because if you hold a pencil between the palms of your hands and slide one palm over the other (a shearing motion), the pencil rotates. Conversely, we can have a vortex-like field whose curl is zero everywhere except on or near the axis of the vortex. For example, the Maxwell-Ampère law in magnetostatics says that  curl H = J , where H is the magnetizing field and J is the current density.[p] So if the current is confined to a wire, curl H is zero outside the wire—although, as is well known, the field lines circle the wire. The explanation of the paradox is that H gets stronger as we approach the wire, making a shearing pattern, whose effect on the curl counteracts that of the rotation.
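
Both examples in the preceding paragraph can be checked symbolically; the sketch below assumes Python with sympy, and uses a field proportional to the magnetizing field of a straight wire along the z axis.

import sympy as sp
from sympy.vector import CoordSys3D, curl

N = CoordSys3D('N')

shear = N.x*N.j                                  # the shearing field x j
print(curl(shear))                               # -> N.k, i.e. ∇x × j = i × j = k

wire = (-N.y*N.i + N.x*N.j)/(N.x**2 + N.y**2)    # proportional to H around a wire on the z axis
print([sp.simplify(curl(wire).dot(e)) for e in (N.i, N.j, N.k)])   # -> [0, 0, 0] off the axis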

The curl-grad and div-curl operators

We have seen from (9L) that the Laplacian of a scalar field is the divergence of the gradient. Four more such second-order combinations make sense, namely the curl of the gradient (of a scalar field), and the divergence of the curl, the gradient of the divergence, and the curl of the curl (of a vector field). The first two —"curl grad" and "div curl"— can now be disposed of.

Let the surface segment Σ enclosed by the curve C be a segment of the closed surface S surrounding the volume V, and let Σ expand across S until it engulfs S , so that C shrinks to a point on the far side of S. Then, in the nameless theorem (21g) and the Kelvin-Stokes theorem (22c), the integral on the left becomes zero while Σ and ν̂ on the right become S and n̂ , so that the theorems respectively reduce to

∬_S n̂ × ∇p dS  =  0

and

∬_S n̂ ⸱ curl q dS  =  0 .

Applying theorem (5c) to the first of these two equations, and the divergence theorem (5d) to the second, we obtain respectively

∭_V curl ∇p dV  =  0

and

∭_V div curl q dV  =  0 .

As the integrals vanish for any volume V in which the integrands are defined, the integrands must be zero wherever they are defined; that is,

curl ∇p  =  0      (24c)

and

div curl q  =  0 .      (24d)

In words, the curl of the gradient is zero, and the divergence of the curl is zero; or, more concisely, any gradient is irrotational, and any curl is solenoidal.

We might well ask whether the converses are true. Is every irrotational vector field the gradient of something? And is every solenoidal vector field the curl of something? The answers are affirmative, but the proofs require more preparation.

Meanwhile we may note, as a mnemonic aid, that when the left-hand sides of the last two equations are rewritten in the del-cross and del-dot notations, they become  ∇ × ∇p  and  ∇ ⸱ ∇ × q , respectively. The former looks like (but isn't) a cross-product of two parallel vectors, and the latter looks like (but isn't) a scalar triple product with a repeated factor, so that each expression looks like it ought to be zero (and it is). But such appearances can lead one astray, because ∇ is an operator, not a self-contained vector quantity; for example,  ∇p × ∇φ  is not identically zero, because two gradients are not necessarily parallel.[14]
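
A quick symbolic check of (24c) and (24d)—and of the cautionary example just mentioned—can be run as follows, assuming Python with sympy and arbitrarily chosen fields.

import sympy as sp
from sympy.vector import CoordSys3D, curl, divergence, gradient

N = CoordSys3D('N')
p   = N.x**2*N.y + sp.cos(N.z)                        # an arbitrary scalar field
phi = N.y*N.z                                         # another arbitrary scalar field
q   = N.y*N.z*N.i + sp.sin(N.x)*N.j + N.x*N.y*N.k     # an arbitrary vector field

print(curl(gradient(p)))                              # -> 0 (the zero vector): (24c)
print(sp.simplify(divergence(curl(q))))               # -> 0: (24d)
print(gradient(p).cross(gradient(phi)))               # generally non-zero: ∇p × ∇φ ≠ 0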

We should also note, to tie a loose end, that identity (24d) was to be expected from our verbal statement of the Kelvin-Stokes theorem (22c). That statement implies that the flux of the curl through any two surfaces spanning the same closed curve is the same. So if we make a closed surface from two spanning surfaces, the flux into one spanning surface is equal to the flux out of the other, i.e. the net flux out of the closed surface is zero, i.e. the integral of the divergence over the enclosed volume is zero; and since any simple volume in which the divergence is defined can be enclosed this way, the divergence itself (of the curl) must be zero wherever it is defined.

Change per unit length

Continuing (and concluding) the trend of reducing the number of dimensions, we now seek one-dimensional theorems, each of which relates an integral over a path to values at the endpoints of the path. For maximum generality, the path should be allowed to be curved into a second and a third dimension.

We could do this by further specializing theorems (5g) to (5L). We could take a curve Γ with a unit tangent vector ŝ. At every point on Γ we could mount a circular disk with a uniform small area α , centered on Γ and orthogonal to it. We could let V be the volume occupied by all the disks and let S be its enclosing surface; thus V would be a thin right circular cylinder, except that its axis could be curved. If we could arrange for α to cancel out, our four theorems would indeed be reduced to the desired form, provided that there were no contribution from the curved face of the "cylinder" to the integral over S (the "1D proviso"). But, as it turns out, this exercise yields only one case in which the "1D proviso" can be satisfied by a construction involving ŝ and a general field, and we have already almost discovered that case by a simpler argument—which we shall now continue.

Fundamental theorem

Equation (9g) is applicable where p(r) is a scalar field,  s is a parameter measuring arc length along a curve Γ, and ŝ is the unit tangent vector to Γ in the direction of increasing s. Let s take the values s1 and s2 at the endpoints of Γ, where the position vector r takes the values r1 and r2 respectively. Then, integrating (9g) w.r.t. s from s1 to s2 and applying the fundamental theorem of calculus, we get

∫_s1^s2 ∇p ⸱ ŝ ds  =  p(r2) − p(r1) .      (25g)

This is our third integral theorem involving the gradient, and the best-known of the three: it is commonly called simply the gradient theorem,[q] or the fundamental theorem of the gradient, or the fundamental theorem of line integrals; it generalizes the fundamental theorem of calculus to a curved path.[15] If we write  dr  for  ŝ ds (the change in the position vector), we get the theorem in the alternative form

∫_r1^r2 ∇p ⸱ dr  =  p(r2) − p(r1) .      (25r)

As the right-hand side of (25g) or (25r) obviously depends on the endpoints but not on the path in between, so does the integral on the left. This integral is commonly called the work integral of ∇p over the path—because if ∇p is a force, the integral is the work done by the force over the path. So, in words, the gradient theorem says that the change in value of a scalar field from one point to another is the work integral of the gradient of that field over any path from the one to the other.

Applying (25r) to a single element of the curve, we get

<math>dp \,=\, \nabla p \cdot d\mathbf{r}</math>

(26g)

Alternatively, we could have obtained (26g) by multiplying both sides of (9g) by ds, and then obtained (25r) by adding (26g) over all the elemental displacements dr on any path from r1 to r2.
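
That elemental-sum route also lends itself to a direct numerical sketch (NumPy, with the same illustrative field and helix as above): chopping the path into many small displacements dr and accumulating ∇p · dr reproduces p(r2) − p(r1) ever more closely as the subdivision is refined.

```python
# Numerical version of the elemental-sum argument: accumulating grad p . dr over
# many small displacements along the illustrative helix approaches p(r2) - p(r1)
# as the subdivision is refined.
import numpy as np

p      = lambda r: r[0]*r[1]*r[2] + r[2]**2
grad_p = lambda r: np.array([r[1]*r[2], r[0]*r[2], r[0]*r[1] + 2*r[2]])
path   = lambda t: np.array([np.cos(t), np.sin(t), t])

for n in (100, 1000, 10000):
    pts  = np.array([path(t) for t in np.linspace(0.0, 2*np.pi, n + 1)])
    mids = 0.5*(pts[:-1] + pts[1:])               # midpoints of the elements
    drs  = pts[1:] - pts[:-1]                     # elemental displacements dr
    total = sum(grad_p(m) @ dr for m, dr in zip(mids, drs))
    print(n, total, p(pts[-1]) - p(pts[0]))       # sum -> p(r2) - p(r1) = 4*pi**2
```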

If we close the path by setting r2 = r1, the gradient theorem reduces to

<math>\oint \nabla p \cdot d\mathbf{r} \,=\, 0</math>

(27g)

where the integral is around any closed loop. Applying the Kelvin-Stokes theorem then gives

<math>\iint_{\Sigma} (\nabla \times \nabla p) \cdot \hat{\mathbf{n}} \, dS \,=\, 0</math>

(28g)

where Σ is any surface spanning the loop. As this applies to any loop spanned by any surface on which the integrand is defined,  curl ∇p must be zero wherever it is defined. This is a second proof (and indeed the usual method of proof) of theorem (24c).
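
As a closing illustration of (27g), and of why it is not a triviality, the sketch below integrates ∇p around the unit circle in the plane z = 0 and obtains zero, whereas a field that is not a gradient, such as w = (−y, x, 0), has circulation 2π around the same loop, equal to the flux of its curl through the spanning unit disk. The loop and the fields are illustrative assumptions.

```python
# Symbolic check of (27g): the circulation of grad p around a closed loop is zero,
# whereas a non-gradient field such as w = (-y, x, 0) has circulation 2*pi around
# the unit circle, equal to the flux of curl w = (0, 0, 2) through the unit disk,
# consistent with the Kelvin-Stokes theorem. Illustrative fields and loop.
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
loop    = sp.Matrix([sp.cos(t), sp.sin(t), 0])    # the unit circle, closed at t = 0, 2*pi
dloop   = sp.diff(loop, t)
on_loop = dict(zip((x, y, z), loop))

p = x*y*z + z**2 + x**3                           # an arbitrary scalar field
grad_p = sp.Matrix([sp.diff(p, v) for v in (x, y, z)])
print(sp.integrate(grad_p.subs(on_loop).dot(dloop), (t, 0, 2*sp.pi)))   # -> 0

w = sp.Matrix([-y, x, 0])                         # not a gradient
print(sp.integrate(w.subs(on_loop).dot(dloop), (t, 0, 2*sp.pi)))        # -> 2*pi
```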

[To be continued.]

Additional information


Competing interests


None.

Ethics statement


This article does not concern research on human or animal subjects.

TO DO:

  • Keywords
  • Figure(s) & caption(s)
  • Etc.!

Notes

  1. E.g., Feynman (1963, vol. 1, §11-5), having defined velocity from displacement in Cartesian coordinates, shows that velocity is a vector by showing that its coordinate representation contra-rotates (like that of displacement) if the coordinate system rotates.
  2. E.g., Feynman (1963, vol. 1, §11-7), having defined the magnitude and dot-product operators in Cartesian coordinates, shows that they are scalar operators by showing that their representations in rotated coordinates are the same as in the original coordinates (except for names of coordinates and components). And Chen-To Tai (1995, pp. 40–42), having determined the form of the "gradient" operator in a general curvilinear orthogonal coordinate system, shows that it is a vector operator by showing that it has the same form in any other curvilinear orthogonal coordinate system.
  3. Even if we claim that "particles" of matter are wave functions and therefore continuous, this still implies that matter is lumpy in a manner not normally contemplated by continuum mechanics.
  4. If r is the position of a particle and p is its momentum, the last term vanishes. If the force is toward the origin, the previous term also vanishes.
  5. Here we use the broad triangle symbol (△) rather than the narrower Greek Delta (Δ); the latter would more likely be misinterpreted as "change in…"
  6. There is no need for parentheses around ρv , because div ρv cannot mean (div ρ)v , because the divergence of a scalar field is not defined.
  7. The material derivative d/dt is also called the substantive derivative, and is sometimes written D/Dt if the result is meant to be understood as a field rather than simply a function of time (Kemmer, 1977, pp. 184–5).
  8. Or nabla, because it allegedly looks like the ancient Phoenician harp that the Greeks called by that name.
  9. But Gibbs (1881) and Wilson (1907) were content to leave it as ∇·∇. And they did not call it the Laplacian; they used that term with a different meaning, which has apparently fallen out of fashion.
  10. The common perception that they are valid only in Cartesian coordinates arises chiefly from failure to allow for the variability of the basis vectors in other coordinate systems; cf. Kemmer, 1977, pp. 163–5, 172–3 (Exs. 2, 3, 5), 230–33 (sol'ns), and Feynman, 1963, vol. 2, §2-8 ("Pitfall number two…").
  11. Stress is a second-order tensor, and the origin of the term "tensor"; but, for present purposes, it's just another possible example of a field called ψ.
  12. Kemmer (1977, p. 98, eq. 4) gives an equivalent result for our first three integral theorems (5g to 5d) only, and calls it the generalized divergence theorem because the divergence theorem is its most familiar special case.
  13. In mathematical jargon, it should be a two-dimensional manifold embedded in 3D Euclidean space.
  14. If any part of our argument requires Σ or C to be smooth, this is not an impediment, because having approximated Σ or C to any desired accuracy by a polyhedron or polygon, we can then approximate the polyhedron or polygon to any desired higher accuracy by a smooth surface or curve!
  15. Although Hsu (1984, p. 141) applies that name to our theorem (5c).
  16. In the general case, there is an extra term ∂D/∂t on the right; but this term is zero in the magnetostatic case.
  17. Although Hsu (1984, p. 141) applies that name to our theorem (5g).

References

  1. Axler, 1995, §9. The relegation of determinants was anticipated by C.G. Broyden (1975). But Broyden's approach is less radical: he does not deal with abstract vector spaces or abstract linear transformations, and his eventual definition of the determinant, unlike Axler's, is traditional—not a product of the preceding narrative.
  2. Axler, 1995, §1. But it is Broyden (1975), not Axler, who discusses numerical methods at length.
  3. There are many proofs and interpretations of this identity. My own effort, for what it's worth, is "Trigonometric proof of vector triple product expansion", Mathematics Stack Exchange, t.co/NM2v4DJJGo, 2024. The classic is Gibbs, 1881, §§ 26–7.
  4. Gibbs, 1881, § 56.
  5. Katz, 1979, pp. 146–9.
  6. In the three-volume Feynman Lectures on Physics (1963),  −∇p as the "pressure force per unit volume" eventually appears in the 3rd-last lecture of Volume 2 (§40-1).
  7. A demonstration like the foregoing is outlined by Gibbs (1881, § 55).
  8. Wilson, 1907, pp. 147–8; Borisenko & Tarapov, 1968, pp. 147–8 (again); Hsu, 1984, p. 92; Kreyszig, 1988, pp. 485–6; Wrede & Spiegel, 2010, p. 198.
  9. Gibbs (1881, § 50) introduces the gradient with this definition, except that he calls ∇u simply the derivative of u, and u the primitive of ∇u. Use of the term gradient as an alternative to derivative is reported by Wilson (1907, p. 138).
  10. Cf. Borisenko & Tarapov, 1968, p. 157, eq. (4.43), quoted in Tai, 1995, p. 33, eq. (4.19).
  11. E.g., Gibbs, 1884, § 165, eq. (1); Wilson, 1907, p. 255, Ex. 1; Kemmer, 1977, p. 99, eq. (6); Hsu, 1984, p. 146, eq. (7.31).
  12. Cf. Katz, 1979, pp. 149–50.
  13. E.g., Gibbs 1881, § 61; Hsu, 1984, pp. 117–18.
  14. Cf. Feynman, 1963, vol. 2, §2-8.
  15. Presumably this is why Gibbs called the gradient simply the derivative (Gibbs, 1881, § 50; cf. §§ 51, 59).

Bibliography

  • S.J. Axler, 1995, "Down with Determinants!"  American Mathematical Monthly, vol. 102, no. 2 (Feb. 1995), pp. 139–54; jstor.org/stable/2975348.  (Author's preprint, with different pagination: researchgate.net/publication/265273063_Down_with_Determinants.)
  • S.J. Axler, 2023–, Linear Algebra Done Right, 4th Ed., Springer; linear.axler.net (open access).
  • A.I. Borisenko and I.E. Tarapov (tr. & ed. R.A. Silverman), 1968, Vector and Tensor Analysis with Applications, Prentice-Hall; reprinted New York: Dover, 1979, archive.org/details/vectortensoranal0000bori.
  • C.G. Broyden, 1975, Basic Matrices, London: Macmillan.
  • R.P. Feynman, R.B. Leighton, & M. Sands, 1963 etc., The Feynman Lectures on Physics, California Institute of Technology; feynmanlectures.caltech.edu.
  • J.W. Gibbs, 1881–84, "Elements of Vector Analysis", privately printed New Haven: Tuttle, Morehouse & Taylor, 1881 (§§ 1–101), 1884 (§§ 102–189, etc.), archive.org/details/elementsvectora00gibb; published in The Scientific Papers of J. Willard Gibbs (ed. H.A. Bumstead & R.G. Van Name), New York: Longmans, Green, & Co., 1906, vol. 2, archive.org/details/scientificpapers02gibbuoft, pp. 17–90.
  • H.P. Hsu, 1984, Applied Vector Analysis, Harcourt Brace Jovanovich; archive.org/details/appliedvectorana00hsuh.
  • V.J. Katz, 1979, "The history of Stokes' theorem", Mathematics Magazine, vol. 52, no. 3 (May 1979), pp. 146–56; jstor.org/stable/2690275.
  • N. Kemmer, 1977, Vector Analysis: A physicist's guide to the mathematics of fields in three dimensions, Cambridge; archive.org/details/isbn_0521211581.
  • E. Kreyszig, 1962 etc., Advanced Engineering Mathematics, New York: Wiley;  5th Ed., 1983;  6th Ed., 1988;  9th Ed., 2006;  10th Ed., 2011.
  • P.H. Moon and D.E. Spencer, 1965, Vectors, Princeton, NJ: Van Nostrand.
  • W.K.H. Panofsky and M. Phillips, 1962, Classical Electricity and Magnetism, 2nd Ed., Addison-Wesley; reprinted Mineola, NY: Dover, 2005.
  • C.-T. Tai, 1990, "Differential operators in vector analysis and the Laplacian of a vector in the curvilinear orthogonal system" (Technical Report RL 859), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/21026.
  • C.-T. Tai, 1994, "A survey of the improper use of ∇ in vector analysis" (Technical Report RL 909), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/7869.
  • C.-T. Tai, 1995, "A historical study of vector analysis" (Technical Report RL 915), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/7868.
  • E.B. Wilson, 1907, Vector Analysis: A text-book for the use of students of mathematics and physics ("Founded upon the lectures of J. Willard Gibbs…"), 2nd Ed., New York: Charles Scribner's Sons; archive.org/details/vectoranalysisa01wilsgoog.
  • R.C. Wrede and M.R. Spiegel, 2010, Advanced Calculus, 3rd Ed., New York: McGraw-Hill (Schaum's Outlines); archive.org/details/schaumsoutlinesa0000wred.