WikiJournal Preprints/Cut the coordinates! (or Vector Analysis Done Fast)

WikiJournal Preprints
Open access • Publication charge free • Public peer review

WikiJournal User Group is a publishing group of open-access, free-to-publish, Wikipedia-integrated academic journals.


Article information

Abstract

(Partial draft—under construction.)


Introduction

Sheldon Axler, in his essay "Down with determinants!" (1995) and his ensuing book Linear Algebra Done Right (4th Ed., 2023–), does not entirely eliminate determinants, but introduces them as late as possible and then exploits them for what he calls their "main reasonable use in undergraduate mathematics", namely the change-of-variables formula for multiple integrals.[1] Here I treat coordinates in vector analysis somewhat as Axler treats determinants in linear algebra: I introduce coordinates as late as possible and then exploit them in unusually rigorous derivations of vector-analytic identities from vector-algebraic identities. The analogy is imperfect in at least two ways. First, as my alternative title suggests, I have no intention of expanding my paper into a book. Brevity is of the essence. Second, my approach does not extend to computation: while one may well avoid determinants in numerical linear algebra,[2] one can hardly avoid coordinates in numerical vector analysis! But if I merely express the operators of vector analysis in a suitably general coordinate system, leaving other authors to specialize it and compute with it, this too will be consistent with brevity.

The cost of coordinates

Mathematicians define a "vector" as a member of a vector space, which is a set whose members satisfy certain basic rules of algebra (called the vector-space axioms) with respect to another set (called a field), which has its own basic rules (the field axioms), and whose members are called "scalars". Physicists are more fussy. They typically want a "vector" to be not only a member of a vector space, but also a first-order tensor : a "tensor", meaning that it has an existence independent of any coordinate system with which it might be specified; and "first order" (or "first-degree", or "first-rank"), meaning that it is specified by a one-dimensional array of numbers. Similarly, a 2nd-order tensor is specified by a 2-dimensional array (a matrix), and a 3rd-order by a 3-dimensional array, and so on; and a "scalar", being specified by a single number, i.e. by a zero-dimensional array, is a zero-order tensor. In "vector analysis", we are greatly interested in applications to physical situations, and accordingly take the physicists' view on what constitutes a vector or a scalar.

So, for our purposes, defining a quantity by three components in (say) a Cartesian coordinate system is not enough to make it a vector, and defining a quantity as a real function of the three coordinates is not enough to make it a scalar, because we still need to show that the quantity has an independent existence. One way to do this is to show that its coordinate representation behaves appropriately when the coordinate system is changed. Independent existence of a quantity means that its coordinate representation is contravariant, i.e. that it changes so as to compensate for the change in the coordinate system.[a] Independent existence of an operator means that its coordinate representation is covariant, i.e. that the representation of the operator in the coordinate system, with the operand(s) and the result in that system, retains its defining form as the system changes.[b]

I circumvent these complications by the most obvious route: where possible, I initially define the quantity without coordinates; and if a coordinate-based initial definition is thrust upon me, I promptly seek an equivalent coordinate-free definition. If, having defined a quantity without coordinates, we then need to represent it with coordinates, we can choose the coordinate system for convenience.

The limitations of limits

In the branch of pure mathematics known as analysis, there is a thing called a limit, whereby for every positive ϵ there is a positive δ such that if some increment is less than δ, some error is less than ϵ. In the branch of applied mathematics known as continuum mechanics, there is a thing called reality, whereby if the increment is less than some positive δ, the assumption of a continuum becomes ridiculous, so that the error cannot be made less than an arbitrary ϵ. Yet vector "analysis" is typically studied with the intention of applying it to some form of "continuum" mechanics, such as the modeling of elasticity, plasticity, fluid flow, or (widening the net) electrodynamics of ordinary matter; in short, it is studied with the intention of conveniently forgetting that, on a sufficiently small scale, matter is lumpy.[c] One might therefore submit that to express the principles of vector analysis in the language of limits is to strain at a gnat and swallow a camel. I avoid that camel by referring to elements of length or area or volume, each of which is small enough to allow some quantity or quantities to be considered uniform within it, but, for the same reason, large enough to allow such local averaging of the said quantity or quantities as is necessary to tune out the lumpiness.

We shall see bigger camels, where well-known authors define or misdefine a vector operator and then want to treat it like an ordinary vector. These I also avoid.

Prerequisites

I assume that the reader is familiar with the algebra and geometry of vectors in 3D space, including the dot-product, the cross-product, and the scalar triple product, their geometric meanings, their expressions in Cartesian coordinates, and the identity

a × (b × c)  =  a⸱c b  −  a⸱b c ,

which we call the "expansion" of the vector triple product.[d] I further assume that the reader can generalize the concept of a derivative, so as to differentiate a vector with respect to a scalar, e.g.

dr/dt ,

or so as to differentiate a function of several independent variables "partially" w.r.t. one of them, e.g.

∂f/∂x .

Note that if x, y, and z are coordinates, they are not strictly scalars, because they are not independent of the coordinate system! Similarly, the numeric components of a vector in the coordinate directions are not strictly scalars,[3] and the derivatives of a scalar or vector w.r.t. the coordinates are not strictly scalars or vectors; but components of a vector in coordinate-independent directions, and derivatives of a scalar or vector w.r.t. coordinate-independent real variables, are indeed true scalars or vectors.

In view of the above remarks on limits, I also expect the reader to be tolerant of an argument like this: In time dt, let the vectors r and p change by dr and dp, respectively. Then

d(r × p)  =  r × dp  +  dr × p

(where, as always, the orders of the cross-products matter).[e] Or like this: If f is a function of x and y, then

∂/∂x (∂f/∂y)  =  ∂/∂y (∂f/∂x) ;

that is, we can switch the order of partial differentiation. If ∂x is an abbreviation for ∂/∂x, etc., this rule can be written in operational terms as

∂x ∂y  =  ∂y ∂x .

More generally, if ∂i is an abbreviation for ∂/∂xi where i = 1, 2,…, the rule becomes

∂i ∂j  =  ∂j ∂i .

These generalizations of differentiation, however, do not go beyond derivatives w.r.t. real variables, some of which are scalars. Vector analysis is about quantities that may be loosely described as derivatives w.r.t. a vector—usually the position vector.
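Before moving on, readers who prefer to verify the prerequisite identities numerically may use the following minimal sketch (an illustration of my own, assuming only the NumPy library; the random test vectors are arbitrary), which checks the expansion of the vector triple product:

```python
import numpy as np

rng = np.random.default_rng(0)

for _ in range(5):
    a, b, c = rng.standard_normal((3, 3))        # three random 3D vectors
    lhs = np.cross(a, np.cross(b, c))            # a × (b × c)
    rhs = np.dot(a, c) * b - np.dot(a, b) * c    # a⸱c b − a⸱b c
    assert np.allclose(lhs, rhs)

print("expansion of the vector triple product confirmed for 5 random cases")
```

Of course, such a spot-check is no substitute for the proofs cited in the notes; it merely guards against misremembered signs.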

Closed-surface integrals per unit volume

In this section I introduce four quantities—the gradient, the curl, the divergence, and the Laplacian—in a way that will seem unremarkable to those readers who aren't already familiar with them, but strange to those who are. The gradient is commonly introduced in connection with a curve and its endpoints, the curl in connection with a surface segment and its enclosing curve, the divergence in connection with a volume and its enclosing surface, and the Laplacian as a composite of two of the above. Here I introduce all four in connection with a volume and its enclosing surface; and I introduce the Laplacian as a concept in its own right, only later relating it to the others. My initial definitions of the gradient, the curl, and the Laplacian, although not at all novel, are usually thought to be more advanced than the common ones—in spite of being conceptually simpler, and in spite of being obvious variations on the same theme.

Instant integral theorems—with a caveat

The term field, mentioned above in the context of algebraic axioms, has an alternative meaning: if r is the position vector, a scalar field is a scalar-valued function of r, and a vector field is a vector-valued function of r; both may also depend on time. Let V be a volume (3D region) enclosed by a surface S. Let n̂ be the unit normal vector at S, pointing out of V. Let n be the distance from S in the direction of n̂ (positive outside V, negative inside), and let ∂n be an abbreviation for ∂/∂n, with a tacit acknowledgment that the derivative—commonly called the normal derivative—is simply assumed to exist.

In V, and on S, let p be a scalar field (e.g., pressure in a fluid, or temperature), and let q be a vector field (e.g., flow velocity, or heat-flow density), and let ψ be a generic field which may be a scalar or a vector. Let a general element of the surface S have area dS, and let it be small enough to allow n̂, p, q, and ∂n ψ to be considered uniform over the element. Then, for every element, the following four products are well defined:

n̂ p dS ,      n̂ × q dS ,      n̂ ⸱ q dS ,      ∂n ψ dS      (1)

If p is pressure in a non-viscous fluid, the first of these products is the force exerted by the fluid in V  through the area dS. The second product does not have such an obvious physical interpretation; but if q is circulating clockwise about an axis directed through V, the cross-product will be exactly tangential to S and will tend to have a component in the direction of that axis. The third product is the flux of q through the surface element; if q is flow velocity, the third product is the volumetric flow rate (volume per unit time) out of V  through dS. The fourth product, by analogy with the third, might be called the flux of the normal derivative of ψ through the surface element, but is equally well defined whether ψ is a scalar or a vector.

If we add up each of the four products over all the elements of the surface S, we obtain, respectively, the four surface integrals

∬_S n̂ p dS ,      ∬_S n̂ × q dS ,      ∬_S n̂ ⸱ q dS ,      ∬_S ∂n ψ dS      (2)

in which the double integral sign indicates that the range of integration is two-dimensional. The first integral takes a scalar field and yields a vector; the second takes a vector field and yields a vector; the third takes a vector field and yields a scalar; and the fourth takes a scalar field and yields a scalar, or takes a vector field and yields a vector. If p is pressure in a non-viscous fluid, the first integral is the force exerted by the fluid in V  on the fluid outside V. The second integral may be called the skew surface integral of q over S ,[4] or, for the reason hinted above, the circulation of q over S.  The third integral, commonly called the flux integral (or simply the surface integral) of q over S, is the total flux of q out of V. And the fourth integral is the surface integral of the outward normal derivative of ψ.
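As a concrete check on the third (flux) integral, the following sketch (my own illustration, assuming NumPy; the choice of the field q = r and of a spherical S of radius a is arbitrary) evaluates the flux integral by simple quadrature and compares it with the exact value 4πa³, which is three times the enclosed volume:

```python
import numpy as np

a = 2.0                                     # sphere radius (arbitrary)
n_theta, n_phi = 200, 400                   # quadrature resolution
dtheta, dphi = np.pi / n_theta, 2 * np.pi / n_phi

theta = (np.arange(n_theta) + 0.5) * dtheta          # polar angle (midpoints)
phi   = (np.arange(n_phi) + 0.5) * dphi              # azimuth (midpoints)
T, P = np.meshgrid(theta, phi, indexing="ij")

# Outward unit normal on the sphere (here equal to the radial unit vector)
n_hat = np.stack([np.sin(T) * np.cos(P),
                  np.sin(T) * np.sin(P),
                  np.cos(T)], axis=-1)

q = a * n_hat                               # the field q = r, evaluated on the surface
dS = a**2 * np.sin(T) * dtheta * dphi       # scalar area element

flux = np.sum(np.sum(n_hat * q, axis=-1) * dS)       # the flux integral of q over S
print(flux, 4 * np.pi * a**3)               # both ≈ 100.53
```

The factor of three relative to the enclosed volume will be explained by the divergence theorem (5d) below, once the divergence has been defined.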

Let the volume V  be divided into elements. Let a general volume element have the volume dV and be enclosed by the surface δS —not to be confused with the area dS of a surface element, which may be an element of S or of δS. Now consider what happens if, instead of evaluating each of the above surface integrals over S, we evaluate it over each δS and add up the results for all the volume elements. In the interior of V, each surface element of area dS is on the boundary between two volume elements, for which the unit normals n̂ at dS, and the respective values of n ψ, are equal and opposite. Hence when we add up the integrals over the surfaces δS, the contributions from the elements dS cancel in pairs, except on the original surface S, so that we are left with the original integral over S. Thus, for the four surface integrals in (2), we have respectively

∬_S n̂ p dS  =  Σ ∬_δS n̂ p dS ,
∬_S n̂ × q dS  =  Σ ∬_δS n̂ × q dS ,
∬_S n̂ ⸱ q dS  =  Σ ∬_δS n̂ ⸱ q dS ,
∬_S ∂n ψ dS  =  Σ ∬_δS ∂n ψ dS ,      (3)

where each Σ denotes a sum over all the volume elements.
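The cancellation-in-pairs argument behind equations (3) can itself be illustrated numerically. The sketch below (my own illustration, assuming NumPy; the sample field and the particular subdivision of the unit cube into two boxes are arbitrary) computes the flux integral over a cube and over its two halves, and shows that the contributions of the shared internal face cancel, so that the sum of the two half-cube integrals equals the integral over the whole:

```python
import numpy as np

q = lambda x, y, z: np.stack([y * z, x**2, x * y * z], axis=-1)   # an arbitrary vector field

def outward_flux(lo, hi, m=200):
    """Midpoint-rule estimate of the flux of q out of the box with corners lo, hi."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    flux = 0.0
    for axis in range(3):
        a1, a2 = (axis + 1) % 3, (axis + 2) % 3
        u = lo[a1] + (np.arange(m) + 0.5) * (hi[a1] - lo[a1]) / m
        v = lo[a2] + (np.arange(m) + 0.5) * (hi[a2] - lo[a2]) / m
        U, V = np.meshgrid(u, v, indexing="ij")
        dA = (hi[a1] - lo[a1]) * (hi[a2] - lo[a2]) / m**2
        for face, sign in ((hi[axis], 1.0), (lo[axis], -1.0)):
            coords = [None, None, None]
            coords[axis], coords[a1], coords[a2] = np.full_like(U, face), U, V
            x, y, z = coords
            flux += sign * np.sum(q(x, y, z)[..., axis]) * dA    # n̂ ⸱ q dS on this face
    return flux

whole = outward_flux((0, 0, 0), (1, 1, 1))
left  = outward_flux((0, 0, 0), (0.5, 1, 1))      # the two halves share the face x = 0.5
right = outward_flux((0.5, 0, 0), (1, 1, 1))
print(whole, left + right)                        # both ≈ 0.25; the shared-face terms cancel
```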

Now comes a big "if":  if  we define the gradient of p (pronounced "grad p") as

∇p  =  (1/dV) ∬_δS n̂ p dS      (4g)

and the curl of q as

curl q  =  (1/dV) ∬_δS n̂ × q dS      (4c)

and the divergence of q as

div q  =  (1/dV) ∬_δS n̂ ⸱ q dS      (4d)

and the Laplacian of ψ as [f]

△ψ  =  (1/dV) ∬_δS ∂n ψ dS      (4L)

(where the letters after the equation number stand for gradient, curl, divergence, and Laplacian, respectively), then equations (3) can be rewritten

∬_S n̂ p dS  =  Σ ∇p dV ,
∬_S n̂ × q dS  =  Σ curl q dV ,
∬_S n̂ ⸱ q dS  =  Σ div q dV ,
∬_S ∂n ψ dS  =  Σ △ψ dV .

But because each term in each sum has a factor dV, we call the sum an integral; and because the range of integration is three-dimensional, we use a triple integral sign. Thus we obtain the following four theorems relating integrals over an enclosing surface to integrals over the enclosed volume:

∬_S n̂ p dS  =  ∭_V ∇p dV      (5g)

∬_S n̂ × q dS  =  ∭_V curl q dV      (5c)

∬_S n̂ ⸱ q dS  =  ∭_V div q dV      (5d)

∬_S ∂n ψ dS  =  ∭_V △ψ dV      (5L)

Of the above four results, only the third (5d) seems to have a standard name; it is called the divergence theorem (or Gauss's theorem or, more properly, Ostrogradsky's theorem[5]), and is indeed the best known of the four—although the other three, having been derived in parallel with it, may be said to stand on similar foundations.

As each of the operators ∇, curl, and div calls for an integration w.r.t. area and then a division by volume, the dimension (or unit of measurement) of the result is the dimension of the operand divided by the dimension of length, as if the operation were some sort of differentiation w.r.t. position. Moreover, in each of equations (5g) to (5d), there is a triple integral on the right but only a double integral on the left, so that each of the operators ∇, curl, and div appears to compensate for a single integration. For these reasons, and for convenience, we shall describe them as differential operators. By comparison, the △ operator in (4L) or (5L) calls for a further differentiation w.r.t. n ; we shall therefore describe △ as a 2nd-order differential operator. (Another reason for these descriptions will emerge in due course.) As advertised, the four definitions (4g) to (4L) are "obvious variations on the same theme" (although the fourth is somewhat less obvious than the others).

But remember the "if": Theorems (5g) to (5L) depend on definitions (4g) to (4L) and are therefore only as definite as those definitions. Equations (3), without assuming anything about the shapes and sizes of the closed surfaces δS (except, tacitly, that n̂ is piecewise well-defined), indicate that the surface integrals are additive with respect to volume. But this additivity, by itself, does not guarantee that the surface integrals are shared among neighboring volume elements in proportion to their volumes, as envisaged by "definitions" (4g) to (4L). Each of these "definitions" is unambiguous if, and only if, the ratio of the surface integral to dV is insensitive to the shape and size of δS for a sufficiently small δS. Notice that the issue here is not whether the ratios specified in equations (4g) to (4L) are true vectors or scalars, independent of the coordinates; all of the operations needed in those equations have coordinate-free definitions. Rather, the issue is whether the resulting ratios are unambiguous notwithstanding the ambiguity of δS, provided only that δS is sufficiently small. That is the advertised "caveat", which must now be addressed.
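The next section addresses that question in general; as a foretaste, the following sketch (my own illustration, assuming NumPy; the test point, the sample fields, and the cube-shaped δS are arbitrary choices) evaluates the closed-surface integrals in (4g), (4c), and (4d) over a small cube and divides by its volume. The results agree with the familiar Cartesian formulas (quoted here without derivation, purely as a check), and halving the cube changes the ratios only slightly, illustrating the insensitivity to the size of δS that the definitions require:

```python
import numpy as np

def p(r):                                    # a sample scalar field
    x, y, z = r
    return x**2 * y + z

def q(r):                                    # a sample vector field
    x, y, z = r
    return np.array([y * z, x**2, x * y * z])

def ratios(r0, h, m=40):
    """Closed-surface integrals over a cube of half-width h centred at r0,
    each divided by dV = (2h)**3, as in definitions (4g), (4c), (4d)."""
    dV = (2 * h)**3
    grad = np.zeros(3); curl = np.zeros(3); div = 0.0
    s = (np.arange(m) + 0.5) / m * 2 * h - h          # face-grid midpoints
    dA = (2 * h / m)**2
    for axis in range(3):
        for sign in (1.0, -1.0):
            n_hat = np.zeros(3); n_hat[axis] = sign   # outward unit normal of this face
            for u in s:
                for v in s:
                    r = np.array(r0, float)
                    r[axis] += sign * h
                    r[(axis + 1) % 3] += u
                    r[(axis + 2) % 3] += v
                    grad += n_hat * p(r) * dA           # n̂ p dS
                    curl += np.cross(n_hat, q(r)) * dA  # n̂ × q dS
                    div  += np.dot(n_hat, q(r)) * dA    # n̂ ⸱ q dS
    return grad / dV, curl / dV, div / dV

r0 = (0.7, -0.3, 0.5)
x, y, z = r0
print("Cartesian grad, curl, div:",
      np.array([2*x*y, x**2, 1.0]),
      np.array([x*z, y - y*z, 2*x - z]),
      x*y)
for h in (0.1, 0.05):                        # halving the cube barely changes the ratios
    print("h =", h, ":", *ratios(r0, h))
```

A fuller treatment would also vary the shape of δS; the cube is merely the easiest surface to integrate over.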

Unambiguity of gradient, divergence, and curl

In accordance with our "applied" mathematical purpose, we shall establish the unambiguity of a differential operator by a kind of thought experiment in which we apply the operator to a physical field, say f, and find that it yields another physical field whose unambiguity is beyond dispute. The conclusion is then applicable to any operand field whose mathematical properties are consistent with its interpretation as the physical field f; the loss of generality, if any, is only what is incurred by that interpretation.

Suppose that a fluid with density ρ flows with velocity v under the sole influence of the internal pressure p. Then the integral in (4g) is the force exerted by the fluid inside δS on the fluid outside, so that minus the integral is the force exerted on the fluid inside δS. Dividing by dV, we find that −∇p, as defined by (4g), is the force per unit volume,[6] which is the acceleration times the mass per unit volume; that is,

ρ dv/dt  =  −∇p .

Now provided that the left side of this equation is locally continuous, it can be considered uniform inside the small δS, so that the left side is unambiguous, whence ∇p is also unambiguous. If there are additional forces on the fluid element, e.g. due to gravity and/or viscosity, then −∇p is not the sole contribution to density-times-acceleration, but is still the contribution due to pressure, which is still unambiguous.

By showing the unambiguity of definition (4g), we have confirmed theorem (5g). In the process we have seen that the volume-based definition of the gradient is useful for the modeling of fluids, and intuitive in matching the common notion that a pressure "gradient" gives rise to a force. We shall see, however, that the unambiguity of the divergence is even more useful, because it implies the unambiguity of both the gradient and the curl.

In the aforesaid fluid, in time dt, the volume that flows out of δS  through the surface element of area dS  is v dt ⸱ n̂ dS.  Multiplying by density and integrating over δS, we find that the mass flowing out of δS  in time dt is dt ∬_δS n̂ ⸱ ρv dS.  Dividing this by dV, and then by dt, we get the rate of reduction of density inside δS ; that is,

(1/dV) ∬_δS n̂ ⸱ ρv dS  =  −∂ρ/∂t .

If the right-hand side is locally continuous, it can be considered uniform inside δS and therefore unambiguous, so that the left side is also unambiguous. But the left side is simply div ρv, as defined by (4d), which is therefore also unambiguous,[7] confirming theorem (5d). In short, the divergence operator is that which maps ρv to −∂ρ/∂t :

div ρv  =  −∂ρ/∂t .

This result is the so-called equation of continuity, in a form expressing conservation of mass. It shows—as we would expect from the everyday meaning of the word—that the divergence of ρv is positive if the fluid is expanding (ρ decreasing), negative if it is contracting (ρ increasing), and zero if it is incompressible. In the last case, the density ρ is uniform and may therefore be taken outside the integral (hence outside the div operator), so that the equation of continuity reduces to

div v = 0     (for an incompressible fluid).

So a vector field whose divergence is zero may represent the flow velocity of an incompressible fluid. In such a velocity field, any tubular surface tangential to the flow velocity, and consequently with no flow in or out of the "tube", has the same volumetric flow rate across all cross-sections of the "tube", as if the surface were the wall of a pipe full of liquid. Accordingly, a vector field with zero divergence is described as solenoidal (from the Greek word for "pipe"). More generally, a solenoidal vector field has the property that for any tubular surface tangential to the field, the flux integrals across any two cross-sections of the "tube" are the same, because otherwise there would be a net flux integral out of the closed surface comprising the two cross-sections and the segment of tube between them—in which case, by the divergence theorem (5d), the divergence could not be zero everywhere inside.
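As a small numerical illustration of a solenoidal field, the sketch below (my own, assuming NumPy; the rigid-rotation field v = ω × r is an arbitrary choice of incompressible flow) confirms that the outward flux of v through a closed surface, here a cube, vanishes to within quadrature error:

```python
import numpy as np

omega = np.array([0.3, -1.0, 2.0])                 # an arbitrary angular velocity
v = lambda r: np.cross(omega, r)                   # rigid-rotation velocity field (div v = 0)

def outward_flux(center, h, m=60):
    """Flux of v out of a cube of half-width h centred at `center` (midpoint rule)."""
    s = (np.arange(m) + 0.5) / m * 2 * h - h
    dA = (2 * h / m)**2
    flux = 0.0
    for axis in range(3):
        for sign in (1.0, -1.0):
            n_hat = np.zeros(3); n_hat[axis] = sign
            for a in s:
                for b in s:
                    r = np.array(center, float)
                    r[axis] += sign * h
                    r[(axis + 1) % 3] += a
                    r[(axis + 2) % 3] += b
                    flux += np.dot(n_hat, v(r)) * dA
    return flux

print(outward_flux((1.0, 2.0, -0.5), 0.4))         # ≈ 0: the field is solenoidal
```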

To see how the unambiguity of the divergence (4d) implies the unambiguity of the curl (4c), we start with (4c) and take dot-products with an arbitrary constant vector b, obtaining

b ⸱ curl q  =  (1/dV) ∬_δS b ⸱ (n̂ × q) dS  =  (1/dV) ∬_δS n̂ ⸱ (q × b) dS ,

that is, by (4d),

b ⸱ curl q  =  div (q × b) .

This is an identity for any constant vector b. In the special case in which b is a unit vector, the left side of the identity is the (scalar) component of curl q in the direction of b, and the right side is unambiguous. Thus the curl is unambiguous because its component in any direction is unambiguous. This confirms theorem (5c).

Similarly, to see how the unambiguity of the divergence implies the unambiguity of the gradient, we start with (4g) and take dot-products with an arbitrary constant vector b, eventually obtaining

b ⸱ ∇p  =  div (p b) .

In the special case in which b is a unit vector, this result says that the (scalar) component of ∇p in the direction of b is given by the right-hand side, which again is unambiguous. Thus we have a second explanation of the unambiguity of the gradient: the gradient is unambiguous because its component in any direction is unambiguous.
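Having established the coordinate-free definitions, we are also free to choose Cartesian coordinates for a consistency check. The sketch below (my own illustration, assuming SymPy; the component formulas for grad, curl, and div are quoted without derivation and used only for verification) confirms symbolically that b ⸱ curl q = div (q × b) and b ⸱ ∇p = div (p b) for a constant vector b and arbitrary smooth fields p and q:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
b = sp.Matrix(sp.symbols('b1 b2 b3'))              # constant vector b (independent of x, y, z)
p = sp.Function('p')(x, y, z)                      # arbitrary scalar field
q = sp.Matrix([sp.Function(f'q{i}')(x, y, z) for i in (1, 2, 3)])   # arbitrary vector field

grad = lambda f: sp.Matrix([sp.diff(f, w) for w in (x, y, z)])
div  = lambda F: sum(sp.diff(F[i], w) for i, w in enumerate((x, y, z)))
curl = lambda F: sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                            sp.diff(F[0], z) - sp.diff(F[2], x),
                            sp.diff(F[1], x) - sp.diff(F[0], y)])

print(sp.simplify(b.dot(curl(q)) - div(q.cross(b))))   # 0:  b ⸱ curl q = div (q × b)
print(sp.simplify(b.dot(grad(p)) - div(p * b)))        # 0:  b ⸱ ∇p = div (p b)
```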

Let us now summarize the definitions that have been found unambiguous. If n̂ is the outward unit normal from the surface of the volume element, then

  • the gradient of p is the surface integral of  n̂ p per unit volume;
  • the divergence is the outward flux per unit volume; and
  • the curl is the skew surface integral per unit volume, or the surface circulation per unit volume.

The gradient maps a scalar field to a vector field; the divergence maps a vector field to a scalar field; and the curl maps a vector field to a vector field.

The Laplacian is the surface integral of the outward normal derivative, per unit volume, and maps a scalar field to a scalar field, or a vector field to a vector field; but we have not yet shown that it is unambiguous. To do this, and thence to confirm theorem (5L), we need some further preparation, including an alternative definition of the gradient. In the mean time, we can further unify some of the above results.

The del, del-cross, and del-dot operators

The gradient operator ∇ is also called del.[g] If it simply denotes the gradient, we tend to pronounce it "grad" in order to emphasize the result. But it can also appear in combination with various other operators, giving various other results; and in those contexts we tend to pronounce it "del".

If our general definition of the gradient (4g) is also taken as the general definition of the ∇ operator,[8] then, comparing (4g) with (4c) and (4d), we see that

curl  =  (∇ ×)

and

div  =  (∇ ⸱) ,

where the parentheses seem to be required on account of the closing dS in (4g). But if we write the factor dS before the integrand, the del operator in (4g) becomes

∇  =  (1/dV) ∬_δS dS n̂

if  we insist that it is to be read as an operator looking for an operand, and not as a self-contained expression. Then, if we similarly bring forward the dS in (4c) and (4d), the curl and divergence operators become

curl  =  (1/dV) ∬_δS dS n̂ ×

and

div  =  (1/dV) ∬_δS dS n̂ ⸱

and each requires a vector field to operate on. Because these operational equivalences follow from our original and most general definitions, there must be a sense in which they are always valid. That does not mean that they are always convenient, or conducive to the avoidance of error! But we shall revisit these issues later; for the moment, we merely observe that the "del cross" and "del dot" notations allow us to condense our first three integral theorems (5g) to (5d) into the single operational equation

∬_S n̂ ∗ ψ dS  =  ∭_V ∇ ∗ ψ dV      (5∇)

where ∗ is a generic binary operator which may be replaced by a null (for multiplication by a scalar) or a cross or a dot. This result is a generalized volume-integral theorem, relating an integral over a volume to an integral over its enclosing surface.[h]
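As a closing numerical illustration of (5∇), the sketch below (my own, assuming NumPy; the unit cube and the sample fields are arbitrary choices) evaluates both sides of the generalized theorem for the three admissible operators (null, cross, and dot), using the Cartesian expressions for ∇p, curl q, and div q, quoted without derivation, on the volume side:

```python
import numpy as np

# Sample fields on the unit cube [0, 1]^3 (arbitrary choices)
p = lambda x, y, z: x**2 * y + z
q = lambda x, y, z: np.stack([y * z, x**2, x * y * z], axis=-1)

m = 60                                       # quadrature points per direction
s = (np.arange(m) + 0.5) / m                 # midpoint abscissae in [0, 1]
dA, dV = 1.0 / m**2, 1.0 / m**3

# Left-hand sides of (5∇): closed-surface integrals over the cube's six faces
lhs_grad, lhs_curl, lhs_div = np.zeros(3), np.zeros(3), 0.0
U, V = np.meshgrid(s, s, indexing="ij")
for axis in range(3):
    for face, sign in ((1.0, 1.0), (0.0, -1.0)):
        n_hat = np.zeros(3); n_hat[axis] = sign
        coords = [None, None, None]
        coords[axis] = np.full_like(U, face)
        coords[(axis + 1) % 3], coords[(axis + 2) % 3] = U, V
        x, y, z = coords
        qs = q(x, y, z)                                             # shape (m, m, 3)
        lhs_grad += n_hat * np.sum(p(x, y, z)) * dA                 # n̂ p dS
        lhs_curl += np.sum(np.cross(n_hat, qs), axis=(0, 1)) * dA   # n̂ × q dS
        lhs_div  += sign * np.sum(qs[..., axis]) * dA               # n̂ ⸱ q dS

# Right-hand sides of (5∇): volume integrals of the Cartesian grad, curl, div
X, Y, Z = np.meshgrid(s, s, s, indexing="ij")
rhs_grad = np.array([np.sum(2*X*Y), np.sum(X**2), np.sum(np.ones_like(X))]) * dV
rhs_curl = np.array([np.sum(X*Z), np.sum(Y - Y*Z), np.sum(2*X - Z)]) * dV
rhs_div  = np.sum(X*Y) * dV

print(lhs_grad, rhs_grad)    # null operator
print(lhs_curl, rhs_curl)    # cross
print(lhs_div,  rhs_div)     # dot
```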

[To be continued.]

Additional information

Acknowledgments

Competing interests

None.

Ethics statement

This article does not concern research on human or animal subjects.

TO DO:

  • Abstract
  • Keywords
  • Figure(s) & caption(s)
  • Etc.!

Notes

  1. E.g., Feynman (1963, vol. 1, §11-5), having defined velocity from displacement in Cartesian coordinates, shows that velocity is a vector by showing that its coordinate representation contra-rotates (like that of displacement) if the coordinate system rotates.
  2. E.g., Feynman (1963, vol. 1, §11-7), having defined the magnitude and dot-product operators in Cartesian coordinates, shows that they are scalar operators by showing that their representations in rotated coordinates are the same (except for names of coordinates and components) as in the original coordinates. And Chen-To Tai (1995, pp. 40–42), having determined the form of the "gradient" operator in a general curvilinear orthogonal coordinate system, shows that it is a vector operator by showing that it has the same form in any other curvilinear orthogonal coordinate system.
  3. Even if we claim that "particles" of matter are wave functions and therefore continuous, this still implies that matter is lumpy in a manner not normally contemplated by continuum mechanics.
  4. There are many proofs and interpretations of this identity. My own effort, for what it's worth, is at math.stackexchange.com/a/4839213/307861.
  5. If r is the position of a particle and p is its momentum, the last term vanishes. If the force is toward the origin, the previous term also vanishes.
  6. Here I use the broad triangle symbol (△) rather than the narrower Greek Delta (Δ); the latter would more likely be misinterpreted as "change in…"
  7. Or nabla, because it allegedly looks like the ancient Phoenician harp that the Greeks called by that name.
  8. Kemmer (1977, p. 98) calls this result the generalized divergence theorem because the divergence theorem is its most familiar special case.

References

  1. Axler, 1995, §9. The relegation of determinants was anticipated by C.G. Broyden (1975). But Broyden's approach is less radical: he does not deal with abstract vector spaces or abstract linear transformations, and his eventual definition of the determinant, unlike Axler's, is traditional—not a product of the preceding narrative.
  2. Axler, 1995, §1. But it is Broyden (1975), not Axler, who discusses numerical methods at length.
  3. Cf. Kemmer, 1977, p. 7.
  4. Gibbs, 1881, § 56.
  5. Katz, 1979, pp. 146–9.
  6. In the three-volume Feynman Lectures on Physics (1963),  −∇p as the "pressure force per unit volume" eventually appears in the 3rd-last lecture of Volume 2 (§40-1).
  7. A demonstration like the foregoing is outlined by Gibbs (1881, § 55).
  8. Cf. Borisenko & Tarapov, 1968, p. 157, eq. (4.43), quoted in Tai, 1995, p. 33, eq. (4.19).

Bibliography

  • S.J. Axler, 1995, "Down with Determinants!"  American Mathematical Monthly, vol. 102, no. 2 (Feb. 1995), pp. 139–54; jstor.org/stable/2975348.  (Author's preprint, with different pagination: researchgate.net/publication/265273063_Down_with_Determinants.)
  • S.J. Axler, 2023–, Linear Algebra Done Right, 4th Ed., Springer; linear.axler.net (open access).
  • A.I. Borisenko and I.E. Tarapov (tr. & ed. R.A. Silverman), 1968, Vector and Tensor Analysis with Applications, Prentice-Hall; reprinted New York: Dover, 1979, archive.org/details/vectortensoranal0000bori.
  • C.G. Broyden, 1975, Basic Matrices, London: Macmillan.
  • R.P. Feynman, R.B. Leighton, & M. Sands, 1963 etc., The Feynman Lectures on Physics, California Institute of Technology; feynmanlectures.caltech.edu.
  • J.W. Gibbs, 1881–84, "Elements of Vector Analysis", privately printed New Haven: Tuttle, Morehouse & Taylor, 1881 (§§ 1–101), 1884 (§§ 102–189, etc.), archive.org/details/elementsvectora00gibb; published in The Scientific Papers of J. Willard Gibbs (ed. H.A. Bumstead & R.G. Van Name), New York: Longmans, Green, & Co., 1906, vol. 2, archive.org/details/scientificpapers02gibbuoft, pp. 17–90.
  • V.J. Katz, 1979, "The history of Stokes' theorem", Mathematics Magazine, vol. 52, no. 3 (May 1979), pp. 146–56; jstor.org/stable/2690275.
  • N. Kemmer, 1977, Vector Analysis: A physicist's guide to the mathematics of fields in three dimensions, Cambridge; archive.org/details/isbn_0521211581.
  • E. Kreyszig, 1962 etc., Advanced Engineering Mathematics, New York: Wiley;  5th Ed., 1983;  6th Ed., 1988;  9th Ed., 2006;  10th Ed., 2011.
  • P.H. Moon and D.E. Spencer, 1965, Vectors, Princeton, NJ: Van Nostrand.
  • W.K.H. Panofsky and M. Phillips, 1962, Classical Electricity and Magnetism, 2nd Ed., Addison-Wesley; reprinted Mineola, NY: Dover, 2005.
  • C.-T. Tai, 1990, "Differential operators in vector analysis and the Laplacian of a vector in the curvilinear orthogonal system" (Technical Report RL 859), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/21026.
  • C.-T. Tai, 1994, "A survey of the improper use of ∇ in vector analysis" (Technical Report RL 909), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/7869.
  • C.-T. Tai, 1995, "A historical study of vector analysis" (Technical Report RL 915), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/7868.
  • E.B. Wilson, 1907, Vector Analysis: A text-book for the use of students of mathematics and physics ("Founded upon the lectures of J. Willard Gibbs…"), 2nd Ed., New York: Charles Scribner's Sons; archive.org/details/vectoranalysisa01wilsgoog.