A sound understanding of tensors and tensor operations is essential if you want to read and understand modern papers on solid mechanics and finite element modeling of complex material behavior. This brief introduction gives you an overview of tensors and tensor notation. For more details you can read A Brief on Tensor Analysis by J. G. Simmonds, the appendix on vector and tensor notation from Dynamics of Polymeric Liquids - Volume 1 by R. B. Bird, R. C. Armstrong, and O. Hassager, and the monograph by R. M. Brannon. An introduction to tensors in continuum mechanics can be found in An Introduction to Continuum Mechanics by M. E. Gurtin. Most of the material on this page is based on these sources.
The following notation is usually used in the literature:
$$
\begin{aligned}
s &= \text{scalar (lightface italic small)}\\
\mathbf{v} &= \text{vector (boldface roman small)}\\
\boldsymbol{\sigma} &= \text{second-order tensor (boldface Greek)}\\
\boldsymbol{A} &= \text{third-order tensor (boldface italic capital)}\\
\boldsymbol{\mathsf{A}} &= \text{fourth-order tensor (sans-serif capital)}
\end{aligned}
$$
A force $\mathbf{f}$ has a magnitude and a direction, can be added to another force, can be multiplied by a scalar, and so on. These properties make the force $\mathbf{f}$ a vector. Similarly, the displacement $\mathbf{u}$ is a vector because it can be added to other displacements and satisfies the other properties of a vector.

However, a force cannot be added to a displacement to yield a physically meaningful quantity, so the physical spaces in which these two quantities lie must be different.
Recall that a constant force $\mathbf{f}$ moving through a displacement $\mathbf{u}$ does $\mathbf{f}\bullet\mathbf{u}$ units of work. How do we compute this product when the spaces of $\mathbf{f}$ and $\mathbf{u}$ are different? If you try to compute the product graphically, you will have to convert both quantities to a single basis and then compute the scalar product.
An alternative way of thinking about the operation $\mathbf{f}\bullet\mathbf{u}$ is to think of $\mathbf{f}$ as a linear operator that acts on $\mathbf{u}$ to produce a scalar quantity (work). In the notation of sets we can write

$$\mathbf{f}\bullet\mathbf{u} ~~~\equiv~~~ \mathbf{f}: \mathbf{u} \rightarrow \mathbb{R}~.$$

A first-order tensor is a linear operator that sends vectors to scalars.
Next, assume that the force $\mathbf{f}$ acts at a point $\mathbf{x}$. The moment of the force about the origin is given by $\mathbf{x}\times\mathbf{f}$, which is a vector. The vector product can be thought of as a linear operation too; in this case the effect of the operator is to convert a vector into another vector.

A second-order tensor is a linear operator that sends vectors to vectors.

According to Simmonds, "the name tensor comes from elasticity theory where in a loaded elastic body the stress tensor acting on a unit vector normal to a plane through a point delivers the tension (i.e., the force per unit area) acting across the plane at that point."

Examples of second-order tensors are the stress tensor, the deformation gradient tensor, the velocity gradient tensor, and so on.

Another type of tensor that we encounter frequently in mechanics is the fourth-order tensor that takes strains to stresses; in elasticity, this is the stiffness tensor.

A fourth-order tensor is a linear operator that sends second-order tensors to second-order tensors.
A tensor $\boldsymbol{A}$ is a linear transformation from a vector space $\mathcal{V}$ to $\mathcal{V}$. Thus, we can write

$$\boldsymbol{A}: \mathbf{u}\in\mathcal{V} \rightarrow \mathbf{v}\in\mathcal{V}~.$$

More often, we use the following notation:

$$\mathbf{v} = \boldsymbol{A}\,\mathbf{u} \equiv \boldsymbol{A}(\mathbf{u}) \equiv \boldsymbol{A}\bullet\mathbf{u}~.$$

I have used the "dot" notation in this handout. None of the above notations is obviously superior to the others, and each is used widely.
Let $\boldsymbol{A}$ and $\boldsymbol{B}$ be two tensors. Then the sum $(\boldsymbol{A}+\boldsymbol{B})$ is another tensor $\boldsymbol{C}$ defined by

$$\boldsymbol{C} = \boldsymbol{A}+\boldsymbol{B} \implies \boldsymbol{C}\bullet\mathbf{v} = (\boldsymbol{A}+\boldsymbol{B})\bullet\mathbf{v} = \boldsymbol{A}\bullet\mathbf{v} + \boldsymbol{B}\bullet\mathbf{v}~.$$
Multiplication of a tensor by a scalar
Let $\boldsymbol{A}$ be a tensor and let $\lambda$ be a scalar. Then the product $\boldsymbol{C} = \lambda\boldsymbol{A}$ is a tensor defined by

$$\boldsymbol{C} = \lambda\boldsymbol{A} \implies \boldsymbol{C}\bullet\mathbf{v} = (\lambda\boldsymbol{A})\bullet\mathbf{v} = \lambda(\boldsymbol{A}\bullet\mathbf{v})~.$$
The zero tensor $\boldsymbol{\mathit{0}}$ is the tensor which maps every vector $\mathbf{v}$ into the zero vector:

$$\boldsymbol{\mathit{0}}\bullet\mathbf{v} = \mathbf{0}~.$$

The identity tensor $\boldsymbol{\mathit{I}}$ takes every vector $\mathbf{v}$ into itself:

$$\boldsymbol{\mathit{I}}\bullet\mathbf{v} = \mathbf{v}~.$$

The identity tensor is also often written as $\boldsymbol{\mathit{1}}$.
Product of two tensors
Let $\boldsymbol{A}$ and $\boldsymbol{B}$ be two tensors. Then the product $\boldsymbol{C} = \boldsymbol{A}\bullet\boldsymbol{B}$ is the tensor defined by

$$\boldsymbol{C} = \boldsymbol{A}\bullet\boldsymbol{B} \implies \boldsymbol{C}\bullet\mathbf{v} = (\boldsymbol{A}\bullet\boldsymbol{B})\bullet\mathbf{v} = \boldsymbol{A}\bullet(\boldsymbol{B}\bullet\mathbf{v})~.$$

In general, $\boldsymbol{A}\bullet\boldsymbol{B} \neq \boldsymbol{B}\bullet\boldsymbol{A}$.
Transpose of a tensor
The transpose of a tensor $\boldsymbol{A}$ is the unique tensor $\boldsymbol{A}^T$ defined by

$$(\boldsymbol{A}\bullet\mathbf{u})\bullet\mathbf{v} = \mathbf{u}\bullet(\boldsymbol{A}^T\bullet\mathbf{v})~.$$

The following identities follow from the above definition:

$$
\begin{aligned}
(\boldsymbol{A}+\boldsymbol{B})^T &= \boldsymbol{A}^T + \boldsymbol{B}^T~,\\
(\boldsymbol{A}\bullet\boldsymbol{B})^T &= \boldsymbol{B}^T\bullet\boldsymbol{A}^T~,\\
(\boldsymbol{A}^T)^T &= \boldsymbol{A}~.
\end{aligned}
$$
Symmetric and skew tensors
A tensor $\boldsymbol{A}$ is symmetric if

$$\boldsymbol{A} = \boldsymbol{A}^T~.$$

A tensor $\boldsymbol{A}$ is skew if

$$\boldsymbol{A} = -\boldsymbol{A}^T~.$$

Every tensor $\boldsymbol{A}$ can be expressed uniquely as the sum of a symmetric tensor $\boldsymbol{E}$ (the symmetric part of $\boldsymbol{A}$) and a skew tensor $\boldsymbol{W}$ (the skew part of $\boldsymbol{A}$):

$$\boldsymbol{A} = \boldsymbol{E} + \boldsymbol{W}~;~~ \boldsymbol{E} = \cfrac{\boldsymbol{A}+\boldsymbol{A}^T}{2}~;~~ \boldsymbol{W} = \cfrac{\boldsymbol{A}-\boldsymbol{A}^T}{2}~.$$
Tensor product of two vectors
The tensor (or dyadic) product $\mathbf{a}\mathbf{b}$ (also written $\mathbf{a}\otimes\mathbf{b}$) of two vectors $\mathbf{a}$ and $\mathbf{b}$ is the tensor that assigns to each vector $\mathbf{v}$ the vector $(\mathbf{b}\bullet\mathbf{v})\mathbf{a}$:

$$(\mathbf{a}\mathbf{b})\bullet\mathbf{v} = (\mathbf{a}\otimes\mathbf{b})\bullet\mathbf{v} = (\mathbf{b}\bullet\mathbf{v})\mathbf{a}~.$$

Notice that all the above operations on tensors are remarkably similar to matrix operations.
The spectral theorem for tensors is widely used in mechanics. We will start by defining eigenvalues and eigenvectors.
Let $\boldsymbol{S}$ be a second order tensor. Let $\lambda$ be a scalar and $\mathbf{n}$ be a vector such that

$$\boldsymbol{S}\cdot\mathbf{n} = \lambda~\mathbf{n}~.$$

Then $\lambda$ is called an eigenvalue of $\boldsymbol{S}$ and $\mathbf{n}$ is an eigenvector.
A second order tensor has three eigenvalues and three eigenvectors, since the space is three-dimensional. Some of the eigenvalues might be repeated. The number of times an eigenvalue is repeated is called its multiplicity.
In mechanics, many second order tensors are symmetric and positive definite. Note the following important properties of such tensors:
If $\boldsymbol{S}$ is positive definite, then $\lambda > 0$.

If $\boldsymbol{S}$ is symmetric, the eigenvectors $\mathbf{n}$ are mutually orthogonal.

For more on eigenvalues and eigenvectors see Applied linear operators and spectral methods.
Polar decomposition theorem
Let $\boldsymbol{F}$ be a second order tensor with $\det\boldsymbol{F} > 0$. Then there exist positive definite, symmetric tensors $\boldsymbol{U}$, $\boldsymbol{V}$ and a rotation (orthogonal) tensor $\boldsymbol{R}$ such that

$$\boldsymbol{F} = \boldsymbol{R}\cdot\boldsymbol{U} = \boldsymbol{V}\cdot\boldsymbol{R}~.$$

Each of these decompositions is unique.
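The polar decomposition can be computed numerically; the following minimal sketch (not part of the original text) uses `scipy.linalg.polar` on an arbitrary example matrix `F` with positive determinant:

```python
import numpy as np
from scipy.linalg import polar

# An arbitrary deformation-gradient-like matrix with det F > 0 (example data).
F = np.array([[1.2, 0.1, 0.0],
              [0.3, 0.9, 0.2],
              [0.0, 0.1, 1.1]])
assert np.linalg.det(F) > 0

R, U = polar(F, side='right')   # F = R @ U, with U symmetric positive definite
_, V = polar(F, side='left')    # F = V @ R, with V symmetric positive definite

print(np.allclose(F, R @ U))            # True
print(np.allclose(F, V @ R))            # True
print(np.allclose(R @ R.T, np.eye(3)))  # R is orthogonal
```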
Principal invariants of a tensor
Let $\boldsymbol{S}$ be a second order tensor. Then the determinant of $\boldsymbol{S} - \lambda~\boldsymbol{\mathit{I}}$ can be expressed as

$$\det(\boldsymbol{S} - \lambda~\boldsymbol{\mathit{I}}) = -\lambda^3 + I_1(\boldsymbol{S})~\lambda^2 - I_2(\boldsymbol{S})~\lambda + I_3(\boldsymbol{S})~.$$

The quantities $I_1, I_2, I_3$ are called the principal invariants of $\boldsymbol{S}$. Expressions for the principal invariants are given below.
Principal invariants of $\boldsymbol{S}$:

$$
\begin{aligned}
I_1 &= \text{tr}~\boldsymbol{S} = \lambda_1 + \lambda_2 + \lambda_3\\
I_2 &= \cfrac{1}{2}\left[(\text{tr}~\boldsymbol{S})^2 - \text{tr}(\boldsymbol{S}^2)\right] = \lambda_1\,\lambda_2 + \lambda_2\,\lambda_3 + \lambda_3\,\lambda_1\\
I_3 &= \det\boldsymbol{S} = \lambda_1\,\lambda_2\,\lambda_3
\end{aligned}
$$
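As a quick numerical illustration (a sketch with an arbitrary example tensor), the trace/determinant expressions and the eigenvalue expressions for the invariants agree:

```python
import numpy as np

S = np.array([[2.0, 0.5, 0.0],
              [0.5, 3.0, 0.1],
              [0.0, 0.1, 1.5]])  # arbitrary example tensor components

I1 = np.trace(S)
I2 = 0.5 * (np.trace(S)**2 - np.trace(S @ S))
I3 = np.linalg.det(S)

lam = np.linalg.eigvals(S)
print(np.isclose(I1, lam.sum()))                                   # True
print(np.isclose(I2, lam[0]*lam[1] + lam[1]*lam[2] + lam[2]*lam[0]))  # True
print(np.isclose(I3, lam.prod()))                                  # True
```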
Note that $\lambda$ is an eigenvalue of $\boldsymbol{S}$ if and only if

$$\det(\boldsymbol{S} - \lambda~\boldsymbol{\mathit{I}}) = 0~.$$

The resulting equation is called the characteristic equation and is usually written in expanded form as

$$\lambda^3 - I_1(\boldsymbol{S})~\lambda^2 + I_2(\boldsymbol{S})~\lambda - I_3(\boldsymbol{S}) = 0~.$$
Cayley-Hamilton theorem
The Cayley-Hamilton theorem is a very useful result in continuum mechanics. It states that if $\boldsymbol{S}$ is a second order tensor then it satisfies its own characteristic equation:

$$\boldsymbol{S}^3 - I_1(\boldsymbol{S})~\boldsymbol{S}^2 + I_2(\boldsymbol{S})~\boldsymbol{S} - I_3(\boldsymbol{S})~\boldsymbol{\mathit{1}} = \boldsymbol{\mathit{0}}$$
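A short numerical check of the theorem (an illustrative sketch; the matrix `S` is example data):

```python
import numpy as np

S = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])  # arbitrary example tensor components

I1 = np.trace(S)
I2 = 0.5 * (np.trace(S)**2 - np.trace(S @ S))
I3 = np.linalg.det(S)

# S^3 - I1 S^2 + I2 S - I3 1 should be the zero tensor.
residual = S @ S @ S - I1 * (S @ S) + I2 * S - I3 * np.eye(3)
print(np.allclose(residual, 0.0))  # True
```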
All the equations so far have made no mention of the coordinate system. When we use vectors and tensors in computations we have to express them in some coordinate system (basis) and use the components of the object in that basis for our computations.
Commonly used bases are the Cartesian coordinate frame, the cylindrical coordinate frame, and the spherical coordinate frame.
A Cartesian coordinate frame consists of an orthonormal basis $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ together with a point $\mathbf{o}$ called the origin. Since these vectors are mutually perpendicular, we have the following relations:
$$
\text{(1)}\qquad
\begin{aligned}
\mathbf{e}_1\bullet\mathbf{e}_1 &= 1~;~~ \mathbf{e}_1\bullet\mathbf{e}_2 = 0~;~~ \mathbf{e}_1\bullet\mathbf{e}_3 = 0~;\\
\mathbf{e}_2\bullet\mathbf{e}_1 &= 0~;~~ \mathbf{e}_2\bullet\mathbf{e}_2 = 1~;~~ \mathbf{e}_2\bullet\mathbf{e}_3 = 0~;\\
\mathbf{e}_3\bullet\mathbf{e}_1 &= 0~;~~ \mathbf{e}_3\bullet\mathbf{e}_2 = 0~;~~ \mathbf{e}_3\bullet\mathbf{e}_3 = 1~.
\end{aligned}
$$
To make the above relations more compact, we introduce the Kronecker delta symbol

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j~,\\ 0 & \text{if } i \neq j~. \end{cases}$$
Then, instead of the nine equations in (1) we can write (in index notation)

$$\mathbf{e}_i\bullet\mathbf{e}_j = \delta_{ij}~.$$
Einstein summation convention
Recall that the vector $\mathbf{u}$ can be written as

$$\text{(2)}\qquad \mathbf{u} = u_1\mathbf{e}_1 + u_2\mathbf{e}_2 + u_3\mathbf{e}_3 = \sum_{i=1}^3 u_i\mathbf{e}_i~.$$
In index notation, equation (2) can be written as

$$\mathbf{u} = u_i\,\mathbf{e}_i~.$$

This convention is called the Einstein summation convention. If an index is repeated, we understand that to mean that there is a sum over that index.
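The summation convention maps directly onto `numpy.einsum`; here is a minimal sketch with example arrays:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])       # components u_i (example data)
A = np.arange(9.0).reshape(3, 3)    # components A_ij (example data)

s = np.einsum('i,i->', u, u)        # u_i u_i : the repeated index i is summed
v = np.einsum('ij,j->i', A, u)      # v_i = A_ij u_j
print(np.isclose(s, u @ u), np.allclose(v, A @ u))  # True True
```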
Components of a vector
We can write the Cartesian components of a vector $\mathbf{u}$ in the basis $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ as

$$u_i = \mathbf{e}_i\bullet\mathbf{u}~,~~~ i = 1,2,3~.$$
Components of a tensor
Similarly, the components $A_{ij}$ of a tensor $\boldsymbol{A}$ are defined by

$$A_{ij} = \mathbf{e}_i\bullet(\boldsymbol{A}\bullet\mathbf{e}_j)~.$$
Using the definition of the tensor product, we can also write

$$\boldsymbol{A} = \sum_{i,j=1}^3 A_{ij}\,\mathbf{e}_i\mathbf{e}_j \equiv \sum_{i,j=1}^3 A_{ij}\,\mathbf{e}_i\otimes\mathbf{e}_j~.$$
Using the summation convention,

$$\boldsymbol{A} = A_{ij}\,\mathbf{e}_i\mathbf{e}_j \equiv A_{ij}\,\mathbf{e}_i\otimes\mathbf{e}_j~.$$
In this case, the basis tensors are $\{\mathbf{e}_i\otimes\mathbf{e}_j\}$ and the components are $A_{ij}$.
Operation of a tensor on a vector
From the definition of the components of the tensor $\boldsymbol{A}$, we can also see that (using the summation convention)

$$\mathbf{v} = \boldsymbol{A}\bullet\mathbf{u} ~~~\equiv~~~ v_i = A_{ij}\,u_j~.$$
Similarly, the dyadic product can be expressed as

$$(\mathbf{a}\mathbf{b})_{ij} \equiv (\mathbf{a}\otimes\mathbf{b})_{ij} = a_i\,b_j~.$$
We can also write a tensor $\boldsymbol{A}$ in matrix notation as

$$\boldsymbol{A} = A_{ij}\,\mathbf{e}_i\mathbf{e}_j = A_{ij}\,\mathbf{e}_i\otimes\mathbf{e}_j \implies \mathbf{A} = \begin{bmatrix} A_{11} & A_{12} & A_{13}\\ A_{21} & A_{22} & A_{23}\\ A_{31} & A_{32} & A_{33} \end{bmatrix}~.$$
Note that the Kronecker delta represents the components of the identity tensor in a Cartesian basis. Therefore, we can write

$$\boldsymbol{I} = \delta_{ij}\,\mathbf{e}_i\mathbf{e}_j = \delta_{ij}\,\mathbf{e}_i\otimes\mathbf{e}_j \implies \mathbf{I} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}~.$$
Tensor inner product
The inner product $\boldsymbol{A}:\boldsymbol{B}$ of two tensors $\boldsymbol{A}$ and $\boldsymbol{B}$ is an operation that generates a scalar. We define (summation implied)

$$\boldsymbol{A}:\boldsymbol{B} = A_{ij}\,B_{ij}~.$$
The inner product can also be expressed using the trace:

$$\boldsymbol{A}:\boldsymbol{B} = \text{Tr}(\boldsymbol{A}^T\bullet\boldsymbol{B})~.$$
Proof (using the definition of the trace given below):

$$
\begin{aligned}
\text{Tr}(\boldsymbol{A}^T\bullet\boldsymbol{B}) &= \boldsymbol{I}:(\boldsymbol{A}^T\bullet\boldsymbol{B})
= \delta_{ij}\,\mathbf{e}_i\otimes\mathbf{e}_j:(A_{lk}\,\mathbf{e}_k\otimes\mathbf{e}_l \bullet B_{mn}\,\mathbf{e}_m\otimes\mathbf{e}_n)\\
&= \delta_{ij}\,\mathbf{e}_i\otimes\mathbf{e}_j:(A_{mk}\,B_{mn}\,\mathbf{e}_k\otimes\mathbf{e}_n)
= A_{mk}\,B_{mn}\,\delta_{ij}\,\delta_{in}\,\delta_{jk}\\
&= A_{mk}\,B_{mi}\,\delta_{ij}\,\delta_{jk} = A_{mk}\,B_{mj}\,\delta_{jk} = A_{mj}\,B_{mj} = \boldsymbol{A}:\boldsymbol{B}
\end{aligned}
$$
The trace of a tensor is the scalar given by

$$\text{Tr}(\boldsymbol{A}) = \boldsymbol{I}:\boldsymbol{A} = \delta_{ij}\,\mathbf{e}_i\otimes\mathbf{e}_j : A_{mn}\,\mathbf{e}_m\otimes\mathbf{e}_n = \delta_{ij}\,\delta_{im}\,\delta_{jn}\,A_{mn} = A_{ii}$$

The trace of an $N \times N$ matrix is the sum of the components on its main diagonal.
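A numerical check that the component form and the trace form of the inner product agree (an illustrative sketch with random example arrays):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

ip_components = np.einsum('ij,ij->', A, B)   # A:B = A_ij B_ij
ip_trace = np.trace(A.T @ B)                 # Tr(A^T . B)
print(np.isclose(ip_components, ip_trace))   # True
```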
Magnitude of a tensor
The magnitude of a tensor $\boldsymbol{A}$ is defined by

$$\Vert\boldsymbol{A}\Vert = \sqrt{\boldsymbol{A}:\boldsymbol{A}} \equiv \sqrt{A_{ij}\,A_{ij}}~.$$
Tensor product of a tensor with a vector
Another tensor operation that is often seen is the tensor product of a tensor with a vector. Let $\boldsymbol{A}$ be a tensor and let $\mathbf{v}$ be a vector. Then the tensor cross product gives a tensor $\boldsymbol{C}$ defined by

$$\boldsymbol{C} = \boldsymbol{A}\times\mathbf{v} \implies C_{ij} = e_{klj}\,A_{ik}\,v_l~.$$
The permutation symbol $e_{ijk}$ is defined as

$$e_{ijk} = \begin{cases} 1 & \text{if } ijk = 123, 231, \text{ or } 312\\ -1 & \text{if } ijk = 321, 132, \text{ or } 213\\ 0 & \text{if any two indices are alike} \end{cases}$$
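The permutation symbol and the tensor cross product defined above translate directly into NumPy (a minimal sketch with example data):

```python
import numpy as np

# Permutation (Levi-Civita) symbol e_ijk as a 3x3x3 array.
e = np.zeros((3, 3, 3))
e[0, 1, 2] = e[1, 2, 0] = e[2, 0, 1] = 1.0
e[2, 1, 0] = e[0, 2, 1] = e[1, 0, 2] = -1.0

A = np.arange(9.0).reshape(3, 3)   # example tensor components A_ik
v = np.array([1.0, -2.0, 0.5])     # example vector components v_l

# C_ij = e_klj A_ik v_l
C = np.einsum('klj,ik,l->ij', e, A, v)
print(C)
```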
Identities in tensor algebra
Let $\boldsymbol{A}$, $\boldsymbol{B}$ and $\boldsymbol{C}$ be three second order tensors. Then

$$\boldsymbol{A}:(\boldsymbol{B}\cdot\boldsymbol{C}) = (\boldsymbol{C}\cdot\boldsymbol{A}^T):\boldsymbol{B}^T = (\boldsymbol{B}^T\cdot\boldsymbol{A}):\boldsymbol{C}$$
Proof:
It is easiest to show these relations by using index notation with respect to an orthonormal basis. Then we can write

$$\boldsymbol{A}:(\boldsymbol{B}\cdot\boldsymbol{C}) \equiv A_{ij}(B_{ik}\,C_{kj}) = C_{kj}\,A^T_{ji}\,B^T_{ki} \equiv (\boldsymbol{C}\cdot\boldsymbol{A}^T):\boldsymbol{B}^T$$
Similarly,

$$\boldsymbol{A}:(\boldsymbol{B}\cdot\boldsymbol{C}) \equiv A_{ij}(B_{ik}\,C_{kj}) = B^T_{ki}\,A_{ij}\,C_{kj} \equiv (\boldsymbol{B}^T\cdot\boldsymbol{A}):\boldsymbol{C}$$
Recall that the vector differential operator (with respect to a Cartesian basis) is defined as

$$\boldsymbol{\nabla} = \cfrac{\partial}{\partial x_1}\mathbf{e}_1 + \cfrac{\partial}{\partial x_2}\mathbf{e}_2 + \cfrac{\partial}{\partial x_3}\mathbf{e}_3 \equiv \cfrac{\partial}{\partial x_i}\mathbf{e}_i~.$$
In this section we summarize some operations of $\boldsymbol{\nabla}$ on vectors and tensors.
The gradient of a vector field
The dyadic product $\boldsymbol{\nabla}\mathbf{v}$ (or $\boldsymbol{\nabla}\otimes\mathbf{v}$) is called the gradient of the vector field $\mathbf{v}$. Therefore, the quantity $\boldsymbol{\nabla}\mathbf{v}$ is a tensor given by

$$\boldsymbol{\nabla}\mathbf{v} = \sum_i\sum_j \cfrac{\partial v_j}{\partial x_i}\,\mathbf{e}_i\mathbf{e}_j \equiv v_{j,i}\,\mathbf{e}_i\mathbf{e}_j~.$$
In the alternative dyadic notation,

$$\boldsymbol{\nabla}\mathbf{v} \equiv \boldsymbol{\nabla}\otimes\mathbf{v} = \sum_i\sum_j \cfrac{\partial v_j}{\partial x_i}\,\mathbf{e}_i\otimes\mathbf{e}_j \equiv v_{j,i}\,\mathbf{e}_i\otimes\mathbf{e}_j~.$$
Warning: Some authors define the $ij$ component of $\boldsymbol{\nabla}\mathbf{v}$ as $\partial v_i/\partial x_j = v_{i,j}$.
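A small symbolic check of the convention used here (a sketch using SymPy; the vector field `v` is an arbitrary example):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
x = (x1, x2, x3)
v = (x1 * x2, x2**2, x3 * x1)   # example vector field components v_j

# With the convention used in this article, (grad v)_ij = dv_j / dx_i.
grad_v = sp.Matrix(3, 3, lambda i, j: sp.diff(v[j], x[i]))
sp.pprint(grad_v)
```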
The divergence of a tensor field
Let $\boldsymbol{A}$ be a tensor field. Then the divergence of the tensor field is a vector $\boldsymbol{\nabla}\bullet\boldsymbol{A}$ given by

$$\boldsymbol{\nabla}\bullet\boldsymbol{A} = \sum_j\left[\sum_i \cfrac{\partial A_{ij}}{\partial x_i}\right]\mathbf{e}_j \equiv \cfrac{\partial A_{ij}}{\partial x_i}\,\mathbf{e}_j = A_{ij,i}\,\mathbf{e}_j~.$$
To fix the definition of the divergence of a general tensor field (possibly of order higher than 2), we use the relation

$$(\boldsymbol{\nabla}\bullet\boldsymbol{A})\bullet\mathbf{c} = \boldsymbol{\nabla}\bullet(\boldsymbol{A}\bullet\mathbf{c})$$

where $\mathbf{c}$ is an arbitrary constant vector.
The Laplacian of a vector field
The Laplacian of a vector field is given by

$$\nabla^2\mathbf{v} = \boldsymbol{\nabla}\bullet\boldsymbol{\nabla}\mathbf{v} = \sum_j\left[\sum_i \cfrac{\partial^2 v_j}{\partial x_i^2}\right]\mathbf{e}_j \equiv v_{j,ii}\,\mathbf{e}_j~.$$
Some important identities involving tensors are:

$$\boldsymbol{\nabla}\bullet\boldsymbol{\nabla}\mathbf{v} = \boldsymbol{\nabla}(\boldsymbol{\nabla}\bullet\mathbf{v}) - \boldsymbol{\nabla}\times(\boldsymbol{\nabla}\times\mathbf{v})$$

$$\mathbf{v}\bullet\boldsymbol{\nabla}\mathbf{v} = \frac{1}{2}\,\boldsymbol{\nabla}(\mathbf{v}\bullet\mathbf{v}) - \mathbf{v}\times(\boldsymbol{\nabla}\times\mathbf{v})$$

$$\boldsymbol{\nabla}\bullet(\mathbf{v}\otimes\mathbf{w}) = \mathbf{v}\bullet\boldsymbol{\nabla}\mathbf{w} + \mathbf{w}\,(\boldsymbol{\nabla}\bullet\mathbf{v})$$

$$\boldsymbol{\nabla}\bullet(\varphi\boldsymbol{A}) = \boldsymbol{\nabla}\varphi\bullet\boldsymbol{A} + \varphi\,\boldsymbol{\nabla}\bullet\boldsymbol{A}$$

$$\boldsymbol{\nabla}(\mathbf{v}\bullet\mathbf{w}) = (\boldsymbol{\nabla}\mathbf{v})\bullet\mathbf{w} + (\boldsymbol{\nabla}\mathbf{w})\bullet\mathbf{v}$$

$$\boldsymbol{\nabla}\bullet(\boldsymbol{A}\bullet\mathbf{w}) = (\boldsymbol{\nabla}\bullet\boldsymbol{A})\bullet\mathbf{w} + \boldsymbol{A}^T:(\boldsymbol{\nabla}\mathbf{w})$$
The following integral theorems are useful in continuum mechanics and finite elements.
The Gauss divergence theorem
If $\Omega$ is a region in space enclosed by a surface $\Gamma$ and $\boldsymbol{A}$ is a tensor field, then

$$\int_\Omega \boldsymbol{\nabla}\bullet\boldsymbol{A}~dV = \int_\Gamma \mathbf{n}\bullet\boldsymbol{A}~dA$$

where $\mathbf{n}$ is the unit outward normal to the surface.
The Stokes curl theorem
If $\Gamma$ is a surface bounded by a closed curve $\mathcal{C}$, then

$$\int_\Gamma \mathbf{n}\bullet(\boldsymbol{\nabla}\times\boldsymbol{A})~dA = \oint_{\mathcal{C}} \mathbf{t}\bullet\boldsymbol{A}~ds$$

where $\boldsymbol{A}$ is a tensor field, $\mathbf{n}$ is the unit normal vector to $\Gamma$ in the direction of a right-handed screw motion along $\mathcal{C}$, and $\mathbf{t}$ is a unit tangential vector in the direction of integration along $\mathcal{C}$.
Let $\Omega$ be a closed moving region of space enclosed by a surface $\Gamma$. Let the velocity of any surface element be $\mathbf{v}$. Then if $\boldsymbol{A}(\mathbf{x}, t)$ is a tensor function of position and time,

$$\cfrac{\partial}{\partial t}\int_\Omega \boldsymbol{A}~dV = \int_\Omega \cfrac{\partial\boldsymbol{A}}{\partial t}~dV + \int_\Gamma \boldsymbol{A}\,(\mathbf{v}\bullet\mathbf{n})~dA$$

where $\mathbf{n}$ is the outward unit normal to the surface $\Gamma$.
Directional derivatives
We often have to find the derivatives of vectors with respect to vectors, and of tensors with respect to vectors and tensors. The directional derivative provides a systematic way of finding these derivatives. The definitions of directional derivatives for various situations are given below. It is assumed that the functions are sufficiently smooth that derivatives can be taken.
Derivatives of scalar valued functions of vectors
Let $f(\mathbf{v})$ be a real valued function of the vector $\mathbf{v}$. Then the derivative of $f(\mathbf{v})$ with respect to $\mathbf{v}$ (or at $\mathbf{v}$) in the direction $\mathbf{u}$ is the vector defined as

$$\frac{\partial f}{\partial\mathbf{v}}\cdot\mathbf{u} = Df(\mathbf{v})[\mathbf{u}] = \left[\frac{\partial}{\partial\alpha}~f(\mathbf{v}+\alpha~\mathbf{u})\right]_{\alpha=0}$$

for all vectors $\mathbf{u}$.
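The defining limit can be approximated by finite differences (an illustrative sketch; the function `f` and the vectors are example choices):

```python
import numpy as np

def f(v):
    """Example scalar-valued function of a vector: f(v) = v . v."""
    return v @ v

def directional_derivative(f, v, u, h=1e-6):
    # Central-difference approximation of d/d(alpha) f(v + alpha u) at alpha = 0.
    return (f(v + h * u) - f(v - h * u)) / (2 * h)

v = np.array([1.0, 2.0, 3.0])
u = np.array([0.5, -1.0, 2.0])

# For f(v) = v.v the exact directional derivative is 2 (v . u).
print(np.isclose(directional_derivative(f, v, u), 2 * v @ u))  # True
```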
Properties:

1) If $f(\mathbf{v}) = f_1(\mathbf{v}) + f_2(\mathbf{v})$ then

$$\frac{\partial f}{\partial\mathbf{v}}\cdot\mathbf{u} = \left(\frac{\partial f_1}{\partial\mathbf{v}} + \frac{\partial f_2}{\partial\mathbf{v}}\right)\cdot\mathbf{u}$$

2) If $f(\mathbf{v}) = f_1(\mathbf{v})~f_2(\mathbf{v})$ then

$$\frac{\partial f}{\partial\mathbf{v}}\cdot\mathbf{u} = \left(\frac{\partial f_1}{\partial\mathbf{v}}\cdot\mathbf{u}\right)f_2(\mathbf{v}) + f_1(\mathbf{v})\left(\frac{\partial f_2}{\partial\mathbf{v}}\cdot\mathbf{u}\right)$$

3) If $f(\mathbf{v}) = f_1(f_2(\mathbf{v}))$ then

$$\frac{\partial f}{\partial\mathbf{v}}\cdot\mathbf{u} = \frac{\partial f_1}{\partial f_2}~\frac{\partial f_2}{\partial\mathbf{v}}\cdot\mathbf{u}$$
Derivatives of vector valued functions of vectors
Let $\mathbf{f}(\mathbf{v})$ be a vector valued function of the vector $\mathbf{v}$. Then the derivative of $\mathbf{f}(\mathbf{v})$ with respect to $\mathbf{v}$ (or at $\mathbf{v}$) in the direction $\mathbf{u}$ is the second order tensor defined as

$$\frac{\partial\mathbf{f}}{\partial\mathbf{v}}\cdot\mathbf{u} = D\mathbf{f}(\mathbf{v})[\mathbf{u}] = \left[\frac{\partial}{\partial\alpha}~\mathbf{f}(\mathbf{v}+\alpha~\mathbf{u})\right]_{\alpha=0}$$

for all vectors $\mathbf{u}$.
Properties:

1) If $\mathbf{f}(\mathbf{v}) = \mathbf{f}_1(\mathbf{v}) + \mathbf{f}_2(\mathbf{v})$ then

$$\frac{\partial\mathbf{f}}{\partial\mathbf{v}}\cdot\mathbf{u} = \left(\frac{\partial\mathbf{f}_1}{\partial\mathbf{v}} + \frac{\partial\mathbf{f}_2}{\partial\mathbf{v}}\right)\cdot\mathbf{u}$$

2) If $\mathbf{f}(\mathbf{v}) = \mathbf{f}_1(\mathbf{v})\times\mathbf{f}_2(\mathbf{v})$ then

$$\frac{\partial\mathbf{f}}{\partial\mathbf{v}}\cdot\mathbf{u} = \left(\frac{\partial\mathbf{f}_1}{\partial\mathbf{v}}\cdot\mathbf{u}\right)\times\mathbf{f}_2(\mathbf{v}) + \mathbf{f}_1(\mathbf{v})\times\left(\frac{\partial\mathbf{f}_2}{\partial\mathbf{v}}\cdot\mathbf{u}\right)$$

3) If $\mathbf{f}(\mathbf{v}) = \mathbf{f}_1(\mathbf{f}_2(\mathbf{v}))$ then

$$\frac{\partial\mathbf{f}}{\partial\mathbf{v}}\cdot\mathbf{u} = \frac{\partial\mathbf{f}_1}{\partial\mathbf{f}_2}\cdot\left(\frac{\partial\mathbf{f}_2}{\partial\mathbf{v}}\cdot\mathbf{u}\right)$$
Derivatives of scalar valued functions of tensors
Let $f(\boldsymbol{S})$ be a real valued function of the second order tensor $\boldsymbol{S}$. Then the derivative of $f(\boldsymbol{S})$ with respect to $\boldsymbol{S}$ (or at $\boldsymbol{S}$) in the direction $\boldsymbol{T}$ is the second order tensor defined as

$$\frac{\partial f}{\partial\boldsymbol{S}}:\boldsymbol{T} = Df(\boldsymbol{S})[\boldsymbol{T}] = \left[\frac{\partial}{\partial\alpha}~f(\boldsymbol{S}+\alpha~\boldsymbol{T})\right]_{\alpha=0}$$

for all second order tensors $\boldsymbol{T}$.
Properties:

1) If $f(\boldsymbol{S}) = f_1(\boldsymbol{S}) + f_2(\boldsymbol{S})$ then

$$\frac{\partial f}{\partial\boldsymbol{S}}:\boldsymbol{T} = \left(\frac{\partial f_1}{\partial\boldsymbol{S}} + \frac{\partial f_2}{\partial\boldsymbol{S}}\right):\boldsymbol{T}$$

2) If $f(\boldsymbol{S}) = f_1(\boldsymbol{S})~f_2(\boldsymbol{S})$ then

$$\frac{\partial f}{\partial\boldsymbol{S}}:\boldsymbol{T} = \left(\frac{\partial f_1}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)f_2(\boldsymbol{S}) + f_1(\boldsymbol{S})\left(\frac{\partial f_2}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)$$

3) If $f(\boldsymbol{S}) = f_1(f_2(\boldsymbol{S}))$ then

$$\frac{\partial f}{\partial\boldsymbol{S}}:\boldsymbol{T} = \frac{\partial f_1}{\partial f_2}\left(\frac{\partial f_2}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)$$
Derivatives of tensor valued functions of tensors
Let $\boldsymbol{F}(\boldsymbol{S})$ be a second order tensor valued function of the second order tensor $\boldsymbol{S}$. Then the derivative of $\boldsymbol{F}(\boldsymbol{S})$ with respect to $\boldsymbol{S}$ (or at $\boldsymbol{S}$) in the direction $\boldsymbol{T}$ is the fourth order tensor defined as

$$\frac{\partial\boldsymbol{F}}{\partial\boldsymbol{S}}:\boldsymbol{T} = D\boldsymbol{F}(\boldsymbol{S})[\boldsymbol{T}] = \left[\frac{\partial}{\partial\alpha}~\boldsymbol{F}(\boldsymbol{S}+\alpha~\boldsymbol{T})\right]_{\alpha=0}$$

for all second order tensors $\boldsymbol{T}$.
Properties:

1) If $\boldsymbol{F}(\boldsymbol{S}) = \boldsymbol{F}_1(\boldsymbol{S}) + \boldsymbol{F}_2(\boldsymbol{S})$ then

$$\frac{\partial\boldsymbol{F}}{\partial\boldsymbol{S}}:\boldsymbol{T} = \left(\frac{\partial\boldsymbol{F}_1}{\partial\boldsymbol{S}} + \frac{\partial\boldsymbol{F}_2}{\partial\boldsymbol{S}}\right):\boldsymbol{T}$$

2) If $\boldsymbol{F}(\boldsymbol{S}) = \boldsymbol{F}_1(\boldsymbol{S})\cdot\boldsymbol{F}_2(\boldsymbol{S})$ then

$$\frac{\partial\boldsymbol{F}}{\partial\boldsymbol{S}}:\boldsymbol{T} = \left(\frac{\partial\boldsymbol{F}_1}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)\cdot\boldsymbol{F}_2(\boldsymbol{S}) + \boldsymbol{F}_1(\boldsymbol{S})\cdot\left(\frac{\partial\boldsymbol{F}_2}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)$$

3) If $\boldsymbol{F}(\boldsymbol{S}) = \boldsymbol{F}_1(\boldsymbol{F}_2(\boldsymbol{S}))$ then

$$\frac{\partial\boldsymbol{F}}{\partial\boldsymbol{S}}:\boldsymbol{T} = \frac{\partial\boldsymbol{F}_1}{\partial\boldsymbol{F}_2}:\left(\frac{\partial\boldsymbol{F}_2}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)$$

4) If $f(\boldsymbol{S}) = f_1(\boldsymbol{F}_2(\boldsymbol{S}))$ then

$$\frac{\partial f}{\partial\boldsymbol{S}}:\boldsymbol{T} = \frac{\partial f_1}{\partial\boldsymbol{F}_2}:\left(\frac{\partial\boldsymbol{F}_2}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)$$
Derivative of the determinant of a tensor

Let $\boldsymbol{A}$ be a second order tensor. Then

$$\frac{\partial}{\partial\boldsymbol{A}}\det(\boldsymbol{A}) = \det(\boldsymbol{A})~[\boldsymbol{A}^{-1}]^T$$

Proof:
Let $\boldsymbol{A}$ be a second order tensor and let $f(\boldsymbol{A}) = \det(\boldsymbol{A})$. Then, from the definition of the derivative of a scalar valued function of a tensor, we have

$$
\begin{aligned}
\frac{\partial f}{\partial\boldsymbol{A}}:\boldsymbol{T} &= \left.\cfrac{d}{d\alpha}\det(\boldsymbol{A}+\alpha~\boldsymbol{T})\right|_{\alpha=0}\\
&= \left.\cfrac{d}{d\alpha}\det\left[\alpha~\boldsymbol{A}\left(\cfrac{1}{\alpha}~\boldsymbol{\mathit{1}} + \boldsymbol{A}^{-1}\cdot\boldsymbol{T}\right)\right]\right|_{\alpha=0}\\
&= \left.\cfrac{d}{d\alpha}\left[\alpha^3~\det(\boldsymbol{A})~\det\left(\cfrac{1}{\alpha}~\boldsymbol{\mathit{1}} + \boldsymbol{A}^{-1}\cdot\boldsymbol{T}\right)\right]\right|_{\alpha=0}~.
\end{aligned}
$$
Recall that we can expand the determinant of a tensor in the form of a characteristic equation in terms of the invariants $I_1, I_2, I_3$ using (note the sign of $\lambda$)

$$\det(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A}) = \lambda^3 + I_1(\boldsymbol{A})~\lambda^2 + I_2(\boldsymbol{A})~\lambda + I_3(\boldsymbol{A})~.$$
Using this expansion we can write

$$
\begin{aligned}
\frac{\partial f}{\partial\boldsymbol{A}}:\boldsymbol{T} &= \left.\cfrac{d}{d\alpha}\left[\alpha^3\det(\boldsymbol{A})\left(\cfrac{1}{\alpha^3} + I_1(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})~\cfrac{1}{\alpha^2} + I_2(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})~\cfrac{1}{\alpha} + I_3(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})\right)\right]\right|_{\alpha=0}\\
&= \left.\det(\boldsymbol{A})~\cfrac{d}{d\alpha}\left[1 + I_1(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})~\alpha + I_2(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})~\alpha^2 + I_3(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})~\alpha^3\right]\right|_{\alpha=0}\\
&= \left.\det(\boldsymbol{A})~\left[I_1(\boldsymbol{A}^{-1}\cdot\boldsymbol{T}) + 2~I_2(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})~\alpha + 3~I_3(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})~\alpha^2\right]\right|_{\alpha=0}\\
&= \det(\boldsymbol{A})~I_1(\boldsymbol{A}^{-1}\cdot\boldsymbol{T})~.
\end{aligned}
$$
Recall that the invariant $I_1$ is given by

$$I_1(\boldsymbol{A}) = \text{tr}\,\boldsymbol{A}~.$$
Hence,

$$\frac{\partial f}{\partial\boldsymbol{A}}:\boldsymbol{T} = \det(\boldsymbol{A})~\text{tr}(\boldsymbol{A}^{-1}\cdot\boldsymbol{T}) = \det(\boldsymbol{A})~[\boldsymbol{A}^{-1}]^T:\boldsymbol{T}~.$$
Invoking the arbitrariness of $\boldsymbol{T}$, we then have

$$\frac{\partial f}{\partial\boldsymbol{A}} = \det(\boldsymbol{A})~[\boldsymbol{A}^{-1}]^T~.$$
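A finite-difference sanity check of this result (an illustrative sketch with arbitrary example matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # example invertible tensor
T = rng.standard_normal((3, 3))                     # arbitrary direction

h = 1e-6
directional = (np.linalg.det(A + h * T) - np.linalg.det(A - h * T)) / (2 * h)
formula = np.linalg.det(A) * np.trace(np.linalg.inv(A) @ T)   # det(A) tr(A^-1 . T)
print(np.isclose(directional, formula))   # True
```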
Derivatives of the invariants of a tensor

The principal invariants of a second order tensor are
$$
\begin{aligned}
I_1(\boldsymbol{A}) &= \text{tr}\,\boldsymbol{A}\\
I_2(\boldsymbol{A}) &= \frac{1}{2}\left[(\text{tr}\,\boldsymbol{A})^2 - \text{tr}\,\boldsymbol{A}^2\right]\\
I_3(\boldsymbol{A}) &= \det(\boldsymbol{A})
\end{aligned}
$$
The derivatives of these three invariants with respect to $\boldsymbol{A}$ are

$$
\begin{aligned}
\frac{\partial I_1}{\partial\boldsymbol{A}} &= \boldsymbol{\mathit{1}}\\
\frac{\partial I_2}{\partial\boldsymbol{A}} &= I_1~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\\
\frac{\partial I_3}{\partial\boldsymbol{A}} &= \det(\boldsymbol{A})~[\boldsymbol{A}^{-1}]^T = I_2~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T~(I_1~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T) = (\boldsymbol{A}^2 - I_1~\boldsymbol{A} + I_2~\boldsymbol{\mathit{1}})^T
\end{aligned}
$$
Proof:
From the derivative of the determinant we know that

$$\frac{\partial I_3}{\partial\boldsymbol{A}} = \det(\boldsymbol{A})~[\boldsymbol{A}^{-1}]^T~.$$
For the derivatives of the other two invariants, let us go back to the characteristic equation

$$\det(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A}) = \lambda^3 + I_1(\boldsymbol{A})~\lambda^2 + I_2(\boldsymbol{A})~\lambda + I_3(\boldsymbol{A})~.$$
Using the same approach as for the determinant of a tensor, we can show that

$$\frac{\partial}{\partial\boldsymbol{A}}\det(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A}) = \det(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A})~[(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A})^{-1}]^T~.$$
Now the left hand side can be expanded as

$$
\begin{aligned}
\frac{\partial}{\partial\boldsymbol{A}}\det(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A}) &= \frac{\partial}{\partial\boldsymbol{A}}\left[\lambda^3 + I_1(\boldsymbol{A})~\lambda^2 + I_2(\boldsymbol{A})~\lambda + I_3(\boldsymbol{A})\right]\\
&= \frac{\partial I_1}{\partial\boldsymbol{A}}~\lambda^2 + \frac{\partial I_2}{\partial\boldsymbol{A}}~\lambda + \frac{\partial I_3}{\partial\boldsymbol{A}}~.
\end{aligned}
$$
Hence

$$\frac{\partial I_1}{\partial\boldsymbol{A}}~\lambda^2 + \frac{\partial I_2}{\partial\boldsymbol{A}}~\lambda + \frac{\partial I_3}{\partial\boldsymbol{A}} = \det(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A})~[(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A})^{-1}]^T$$
or,

$$(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A})^T\cdot\left[\frac{\partial I_1}{\partial\boldsymbol{A}}~\lambda^2 + \frac{\partial I_2}{\partial\boldsymbol{A}}~\lambda + \frac{\partial I_3}{\partial\boldsymbol{A}}\right] = \det(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A})~\boldsymbol{\mathit{1}}~.$$
Expanding the right hand side and separating terms on the left hand side gives

$$(\lambda~\boldsymbol{\mathit{1}} + \boldsymbol{A}^T)\cdot\left[\frac{\partial I_1}{\partial\boldsymbol{A}}~\lambda^2 + \frac{\partial I_2}{\partial\boldsymbol{A}}~\lambda + \frac{\partial I_3}{\partial\boldsymbol{A}}\right] = \left[\lambda^3 + I_1~\lambda^2 + I_2~\lambda + I_3\right]\boldsymbol{\mathit{1}}$$
or,

$$
\begin{aligned}
\left[\frac{\partial I_1}{\partial\boldsymbol{A}}~\lambda^3 + \frac{\partial I_2}{\partial\boldsymbol{A}}~\lambda^2 + \frac{\partial I_3}{\partial\boldsymbol{A}}~\lambda\right]\boldsymbol{\mathit{1}} &+ \boldsymbol{A}^T\cdot\frac{\partial I_1}{\partial\boldsymbol{A}}~\lambda^2 + \boldsymbol{A}^T\cdot\frac{\partial I_2}{\partial\boldsymbol{A}}~\lambda + \boldsymbol{A}^T\cdot\frac{\partial I_3}{\partial\boldsymbol{A}}\\
&= \left[\lambda^3 + I_1~\lambda^2 + I_2~\lambda + I_3\right]\boldsymbol{\mathit{1}}~.
\end{aligned}
$$
If we define $I_0 := 1$ and $I_4 := 0$, we can write the above as
$$
\begin{aligned}
\left[\frac{\partial I_1}{\partial\boldsymbol{A}}~\lambda^3 + \frac{\partial I_2}{\partial\boldsymbol{A}}~\lambda^2 + \frac{\partial I_3}{\partial\boldsymbol{A}}~\lambda + \frac{\partial I_4}{\partial\boldsymbol{A}}\right]\boldsymbol{\mathit{1}} &+ \boldsymbol{A}^T\cdot\frac{\partial I_0}{\partial\boldsymbol{A}}~\lambda^3 + \boldsymbol{A}^T\cdot\frac{\partial I_1}{\partial\boldsymbol{A}}~\lambda^2 + \boldsymbol{A}^T\cdot\frac{\partial I_2}{\partial\boldsymbol{A}}~\lambda + \boldsymbol{A}^T\cdot\frac{\partial I_3}{\partial\boldsymbol{A}}\\
&= \left[I_0~\lambda^3 + I_1~\lambda^2 + I_2~\lambda + I_3\right]\boldsymbol{\mathit{1}}~.
\end{aligned}
$$
Collecting terms containing various powers of $\lambda$, we get

$$
\begin{aligned}
\lambda^3&\left(I_0~\boldsymbol{\mathit{1}} - \frac{\partial I_1}{\partial\boldsymbol{A}}~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\cdot\frac{\partial I_0}{\partial\boldsymbol{A}}\right) + \lambda^2\left(I_1~\boldsymbol{\mathit{1}} - \frac{\partial I_2}{\partial\boldsymbol{A}}~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\cdot\frac{\partial I_1}{\partial\boldsymbol{A}}\right) +\\
&\qquad\lambda\left(I_2~\boldsymbol{\mathit{1}} - \frac{\partial I_3}{\partial\boldsymbol{A}}~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\cdot\frac{\partial I_2}{\partial\boldsymbol{A}}\right) + \left(I_3~\boldsymbol{\mathit{1}} - \frac{\partial I_4}{\partial\boldsymbol{A}}~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\cdot\frac{\partial I_3}{\partial\boldsymbol{A}}\right) = 0~.
\end{aligned}
$$
Then, invoking the arbitrariness of $\lambda$, we have (one equation for each power of $\lambda$)

$$
\begin{aligned}
I_0~\boldsymbol{\mathit{1}} - \frac{\partial I_1}{\partial\boldsymbol{A}}~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\cdot\frac{\partial I_0}{\partial\boldsymbol{A}} &= 0\\
I_1~\boldsymbol{\mathit{1}} - \frac{\partial I_2}{\partial\boldsymbol{A}}~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\cdot\frac{\partial I_1}{\partial\boldsymbol{A}} &= 0\\
I_2~\boldsymbol{\mathit{1}} - \frac{\partial I_3}{\partial\boldsymbol{A}}~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\cdot\frac{\partial I_2}{\partial\boldsymbol{A}} &= 0\\
I_3~\boldsymbol{\mathit{1}} - \frac{\partial I_4}{\partial\boldsymbol{A}}~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\cdot\frac{\partial I_3}{\partial\boldsymbol{A}} &= 0~.
\end{aligned}
$$
This implies that

$$
\begin{aligned}
\frac{\partial I_1}{\partial\boldsymbol{A}} &= \boldsymbol{\mathit{1}}\\
\frac{\partial I_2}{\partial\boldsymbol{A}} &= I_1~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T\\
\frac{\partial I_3}{\partial\boldsymbol{A}} &= I_2~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T~(I_1~\boldsymbol{\mathit{1}} - \boldsymbol{A}^T) = (\boldsymbol{A}^2 - I_1~\boldsymbol{A} + I_2~\boldsymbol{\mathit{1}})^T
\end{aligned}
$$
Derivative of the identity tensor
Let $\boldsymbol{\mathit{1}}$ be the second order identity tensor. Then the derivative of this tensor with respect to a second order tensor $\boldsymbol{A}$ is given by

$$\frac{\partial\boldsymbol{\mathit{1}}}{\partial\boldsymbol{A}}:\boldsymbol{T} = \boldsymbol{\mathsf{0}}:\boldsymbol{T} = \boldsymbol{\mathit{0}}$$

This is because $\boldsymbol{\mathit{1}}$ is independent of $\boldsymbol{A}$.
Derivative of a tensor with respect to itself
Let $\boldsymbol{A}$ be a second order tensor. Then

$$\frac{\partial\boldsymbol{A}}{\partial\boldsymbol{A}}:\boldsymbol{T} = \left[\frac{\partial}{\partial\alpha}(\boldsymbol{A}+\alpha~\boldsymbol{T})\right]_{\alpha=0} = \boldsymbol{T} = \boldsymbol{\mathsf{I}}:\boldsymbol{T}$$
Therefore,

$$\frac{\partial\boldsymbol{A}}{\partial\boldsymbol{A}} = \boldsymbol{\mathsf{I}}$$

Here $\boldsymbol{\mathsf{I}}$ is the fourth order identity tensor. In index notation with respect to an orthonormal basis,

$$\boldsymbol{\mathsf{I}} = \delta_{ik}~\delta_{jl}~\mathbf{e}_i\otimes\mathbf{e}_j\otimes\mathbf{e}_k\otimes\mathbf{e}_l$$
This result implies that

$$\frac{\partial\boldsymbol{A}^T}{\partial\boldsymbol{A}}:\boldsymbol{T} = \boldsymbol{\mathsf{I}}^T:\boldsymbol{T} = \boldsymbol{T}^T$$

where

$$\boldsymbol{\mathsf{I}}^T = \delta_{jk}~\delta_{il}~\mathbf{e}_i\otimes\mathbf{e}_j\otimes\mathbf{e}_k\otimes\mathbf{e}_l$$
Therefore, if the tensor $\boldsymbol{A}$ is symmetric, then the derivative is also symmetric and we get

$$\frac{\partial\boldsymbol{A}}{\partial\boldsymbol{A}} = \frac{\partial\,\frac{1}{2}(\boldsymbol{A}+\boldsymbol{A}^T)}{\partial\boldsymbol{A}} = \frac{1}{2}~(\boldsymbol{\mathsf{I}} + \boldsymbol{\mathsf{I}}^T) = \boldsymbol{\mathsf{I}}^{(s)}$$
where the symmetric fourth order identity tensor is

$$\boldsymbol{\mathsf{I}}^{(s)} = \frac{1}{2}~(\delta_{ik}~\delta_{jl} + \delta_{il}~\delta_{jk})~\mathbf{e}_i\otimes\mathbf{e}_j\otimes\mathbf{e}_k\otimes\mathbf{e}_l$$
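These fourth order identity tensors are easy to construct and test with `numpy.einsum` (a minimal sketch with an arbitrary example tensor `T`):

```python
import numpy as np

d = np.eye(3)   # Kronecker delta

II  = np.einsum('ik,jl->ijkl', d, d)   # I_ijkl = d_ik d_jl
IIT = np.einsum('jk,il->ijkl', d, d)   # transposer: d_jk d_il
IIs = 0.5 * (II + IIT)                 # symmetric identity I^(s)

T = np.random.default_rng(2).standard_normal((3, 3))  # arbitrary tensor
print(np.allclose(np.einsum('ijkl,kl->ij', II,  T), T))                # I : T = T
print(np.allclose(np.einsum('ijkl,kl->ij', IIT, T), T.T))              # I^T : T = T^T
print(np.allclose(np.einsum('ijkl,kl->ij', IIs, T), 0.5 * (T + T.T)))  # symmetrizer
```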
Derivative of the inverse of a tensor

Let $\boldsymbol{A}$ be a second order tensor. Then

$$\frac{\partial}{\partial\boldsymbol{A}}\left(\boldsymbol{A}^{-1}\right):\boldsymbol{T} = -\boldsymbol{A}^{-1}\cdot\boldsymbol{T}\cdot\boldsymbol{A}^{-1}$$

Proof:
Recall that

$$\frac{\partial\boldsymbol{\mathit{1}}}{\partial\boldsymbol{A}}:\boldsymbol{T} = \boldsymbol{\mathit{0}}$$

Since $\boldsymbol{A}^{-1}\cdot\boldsymbol{A} = \boldsymbol{\mathit{1}}$, we can write

$$\frac{\partial}{\partial\boldsymbol{A}}(\boldsymbol{A}^{-1}\cdot\boldsymbol{A}):\boldsymbol{T} = \boldsymbol{\mathit{0}}$$
Using the product rule for second order tensors,

$$\frac{\partial}{\partial\boldsymbol{S}}[\boldsymbol{F}_1(\boldsymbol{S})\cdot\boldsymbol{F}_2(\boldsymbol{S})]:\boldsymbol{T} = \left(\frac{\partial\boldsymbol{F}_1}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)\cdot\boldsymbol{F}_2 + \boldsymbol{F}_1\cdot\left(\frac{\partial\boldsymbol{F}_2}{\partial\boldsymbol{S}}:\boldsymbol{T}\right)$$
we get

$$\frac{\partial}{\partial\boldsymbol{A}}(\boldsymbol{A}^{-1}\cdot\boldsymbol{A}):\boldsymbol{T} = \left(\frac{\partial\boldsymbol{A}^{-1}}{\partial\boldsymbol{A}}:\boldsymbol{T}\right)\cdot\boldsymbol{A} + \boldsymbol{A}^{-1}\cdot\left(\frac{\partial\boldsymbol{A}}{\partial\boldsymbol{A}}:\boldsymbol{T}\right) = \boldsymbol{\mathit{0}}$$
or,

$$\left(\frac{\partial\boldsymbol{A}^{-1}}{\partial\boldsymbol{A}}:\boldsymbol{T}\right)\cdot\boldsymbol{A} = -\boldsymbol{A}^{-1}\cdot\boldsymbol{T}$$
Therefore,

$$\frac{\partial}{\partial\boldsymbol{A}}\left(\boldsymbol{A}^{-1}\right):\boldsymbol{T} = -\boldsymbol{A}^{-1}\cdot\boldsymbol{T}\cdot\boldsymbol{A}^{-1}$$
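As with the determinant, this result can be verified by finite differences (an illustrative sketch with arbitrary example matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # example invertible tensor
T = rng.standard_normal((3, 3))                     # arbitrary direction

h = 1e-6
directional = (np.linalg.inv(A + h * T) - np.linalg.inv(A - h * T)) / (2 * h)
formula = -np.linalg.inv(A) @ T @ np.linalg.inv(A)
print(np.allclose(directional, formula))   # True
```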
The boldface notation that I've used is called the Gibbs notation. The index notation that I have used is also called Cartesian tensor notation.