Linear algebra/Orthogonal matrix

Alternative notations
$Q^{-1}=Q^{\mathrm {T} }$	$Q^{\mathrm {T} }Q=QQ^{\mathrm {T} }=I$
${\underline {\underline {Q}}}^{-1}={\underline {\underline {Q}}}^{\mathrm {T} }$	${\underline {\underline {Q}}}^{\mathrm {T} }\cdot {\underline {\underline {Q}}}={\underline {\underline {Q}}}\cdot {\underline {\underline {Q}}}^{\mathrm {T} }={\underline {\underline {I}}}$
$Q_{jk}^{-1}=Q_{kj}\equiv Q_{jk}^{\mathrm {T} }$	$\sum _{k}Q_{ki}Q_{kj}=\sum _{k}Q_{ik}Q_{jk}=\delta _{ij}$

This article contains excerpts from Wikipedia's Orthogonal matrix.
If complex numbers are involved, see Unitary matrix.

A real square matrix is orthogonal^[1] if and only if its columns form an orthonormal basis in a Euclidean space in which all numbers are real-valued and the dot product is defined in the usual fashion.^[2]^[3] An orthonormal basis in an N-dimensional space is one where (1) all the basis vectors have unit magnitude,^[4] and (2) any two distinct basis vectors are perpendicular, meaning that their dot product is zero.

Fundamental properties

A real square matrix is orthogonal if and only if its columns form an orthonormal basis of the Euclidean space $\mathbb {R} ^{n}$, which is the case if and only if its rows form an orthonormal basis of $\mathbb {R} ^{n}$.^[5]
The determinant of any orthogonal matrix is +1 or −1. But the converse is not true; having a determinant of ±1 is no guarantee of orthogonality. An orthogonal matrix with a determinant equal to +1 is called a special orthogonal matrix.
An orthogonal matrix can always be diagonalized over the complex numbers to exhibit a full set of eigenvalues, all of which must have (complex) modulus 1.
All permutation matrices are orthogonal (but the converse is not true).^[6]
All orthogonal matrices are unitary (but the converse is not true).^[7]
Under the operation of multiplication, the $n \times n$ orthogonal matrices form the orthogonal group known as $O(n)$ .
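
Several of the properties listed above can be spot-checked numerically. The following is a minimal sketch in plain Python, not part of the original page: the helper names (`transpose`, `matmul`, `is_orthogonal`, `det2`), the tolerance, and the four sample matrices are all assumptions chosen for illustration. It tests the defining condition $Q^{\mathrm {T} }Q=I$ and prints the determinant of each 2×2 example.

```python
import math

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def is_orthogonal(m, tol=1e-12):
    """True if m^T m equals the identity to within tol (m is assumed square)."""
    n = len(m)
    p = matmul(transpose(m), m)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

theta = 0.3  # arbitrary angle in radians (illustrative choice)
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]
reflection = [[1.0, 0.0], [0.0, -1.0]]   # flips the y axis
permutation = [[0.0, 1.0], [1.0, 0.0]]   # swaps the two coordinates
shear = [[1.0, 1.0], [0.0, 1.0]]         # determinant 1, but not orthogonal

for name, m in [("rotation", rotation), ("reflection", reflection),
                ("permutation", permutation), ("shear", shear)]:
    print(name, is_orthogonal(m), round(det2(m), 12))
```

The shear matrix is the counterexample to the converse: its determinant is 1, yet its columns are neither perpendicular nor both of unit length, so the orthogonality test fails.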

Three important results that are easy to prove

Among the first things a novice should learn are those that are easy to prove.

Orthonormal basis vectors are hiding in plain sight

Theorem:

If the rows of a square matrix form an orthonormal set of (basis) vectors,

then the transpose of that matrix is the inverse of that matrix:

$\mathbf {M} ^{T}=\mathbf {M} ^{-1}$

Visual understanding

Suppose the rows of a matrix form an orthonormal set of basis vectors, as shown in the i-th row of matrix A to the right. The ij-th element of the product AB is the dot product of the i-th row of A with the j-th column of matrix B, as shown in the upper part of the diagram, where the j-th column is highlighted in yellow. In the diagram's lower part, matrix B is replaced by its transpose, which shifts the elements in column j to a row (highlighted in cyan). This establishes that the product of A with the transpose of B creates elements that are the dot products of rows of A with rows of B.

If A is an orthogonal matrix and B is its transpose, the product AB creates matrix elements that are dot products among the rows of the orthogonal matrix.
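
The row-with-row picture can be sketched in a few lines of plain Python. This is an illustrative example only: the function names (`dot`, `times_transpose`, `matmul`, `transpose`) and the sample matrices `A`, `B`, and `Q` are assumptions made here, not taken from the diagram. The first part checks that dotting rows of A with rows of B reproduces the ordinary product of A with the transpose of B; the second part applies the same routine to a matrix with orthonormal rows.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))

def times_transpose(a, b):
    """Return A B^T: element (i, j) is row i of A dotted with row j of B."""
    return [[dot(row_a, row_b) for row_b in b] for row_a in a]

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Ordinary matrix product: rows of A dotted with columns of B."""
    return [[dot(row, col) for col in zip(*b)] for row in a]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(times_transpose(A, B))      # [[17, 23], [39, 53]]
print(matmul(A, transpose(B)))    # [[17, 23], [39, 53]]

# A matrix whose rows are orthonormal: every row dotted with itself gives 1,
# and distinct rows dotted together give 0 (up to floating-point rounding).
Q = [[0.6, 0.8], [-0.8, 0.6]]
print(times_transpose(Q, Q))
```

For `Q`, the result is the identity matrix up to rounding, which is the statement that the product of Q with its transpose is the identity.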

Rigorous proof

This proof illustrates how subscripts are used to manipulate and understand tensors.

1. Suppose

$\mathbf {v_{i}} =\sum _{j}v_{j}\,\mathbf {{\hat {e}}_{j}} =\sum _{j}M_{ij}\,\mathbf {{\hat {e}}_{j}}$

is the i-th element of an orthonormal set of basis vectors. Here $\mathbf {{\hat {e}}_{j}}$ are the original unit vectors used to define the new set of unit vectors, whose components are taken from the rows of matrix $\mathbf {\underline {\underline {M}}}$.

2. Now we relabel how we write the sums for $\mathbf {v_{i}}$ and $\mathbf {v_{j}}$ as follows:

$\mathbf {v_{i}} =\sum _{\alpha }M_{i\alpha }\,\mathbf {{\hat {e}}_{\alpha }}$

$\mathbf {v_{j}} =\sum _{\beta }M_{j\beta }\,\mathbf {{\hat {e}}_{\beta }}$

Hint: In the first of these two equations, I replace $j$ by $\alpha$ because summed variables can be changed at will. Sometimes they are called "dummy variables" because they "do not speak" after the sum is done. For example, summing n from 1 to 3 equals 1+2+3, which is the same as summing m from 1 to 3. In the second one I relabeled my dummy variable as $\beta$ because the same dummy variable cannot serve two purposes in a single expression.

3. This yields the following expression for the dot product between our two vectors:

$\mathbf {v_{i}} \cdot \mathbf {v_{j}} =\left(\sum _{\alpha }M_{i\alpha }\,\mathbf {{\hat {e}}_{\alpha }} \right)\cdot \left(\sum _{\beta }M_{j\beta }\,\mathbf {{\hat {e}}_{\beta }} \right)$

$\mathbf {v_{i}} \cdot \mathbf {v_{j}} =\left(\sum _{\alpha }M_{i\alpha }\,\mathbf {{\hat {e}}_{\alpha }} \right)\cdot \left(\sum _{\beta }M_{j\beta }\,\mathbf {{\hat {e}}_{\beta }} \right)=\sum _{\alpha \beta }\,(\mathbf {{\hat {e}}_{\alpha }} \,\cdot \,\mathbf {{\hat {e}}_{\beta }} )\,M_{i\alpha }M_{j\beta }$

4. This last term introduces the Kronecker delta symbol:

$\mathbf {v_{i}} \cdot \mathbf {v_{j}} =\left(\sum _{\alpha }M_{i\alpha }\,\mathbf {{\hat {e}}_{\alpha }} \right)\cdot \left(\sum _{\beta }M_{j\beta }\,\mathbf {{\hat {e}}_{\beta }} \right)=\sum _{\alpha \beta }M_{i\alpha }M_{j\beta }\,\underbrace {\mathbf {{\hat {e}}_{\alpha }} \,\cdot \,\mathbf {{\hat {e}}_{\beta }} } _{\delta _{\alpha \beta }}=\sum _{\alpha }M_{i\alpha }M_{j\alpha }$

The Kronecker delta vanishes unless $\beta =\alpha$, which collapses the double sum into a single sum. The last term almost looks like the product of the matrix with itself. It can be turned into a matrix product by applying the transpose to the second factor, using $M_{j\alpha }=M_{\alpha j}^{T}.$

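5. The sum $\sum _{\alpha }M_{i\alpha }M_{\alpha j}^{T}$ is the ij-th element of the product ${\underline {\underline {M}}}\cdot {\underline {\underline {M}}}^{T}.$ If the rows $\mathbf {v_{i}}$ are orthonormal, then $\mathbf {v_{i}} \cdot \mathbf {v_{j}} =\delta _{ij},$ so this product is the identity and, because ${\underline {\underline {M}}}$ is square, $\,{\underline {\underline {M}}}^{T}={\underline {\underline {M}}}^{-1},$ which proves the theorem. The argument also runs in reverse: if ${\underline {\underline {M}}}$ is orthogonal, then $\,{\underline {\underline {M}}}^{T}={\underline {\underline {M}}}^{-1},$ and we conclude that the rows of $\,{\underline {\underline {M}}}$ (i.e., the vectors $\mathbf {v_{i}} )$ form an orthonormal collection of vectors (i.e. a "rotated" basis for the vector space):

$\mathbf {v_{i}} \cdot \mathbf {v_{j}} =\left(\sum _{\alpha }M_{i\alpha }\,\mathbf {{\hat {e}}_{\alpha }} \right)\cdot \left(\sum _{\beta }M_{j\beta }\,\mathbf {{\hat {e}}_{\beta }} \right)=\sum _{\alpha }M_{i\alpha }M_{j\alpha }=\sum _{\alpha }M_{i\alpha }M_{\alpha j}^{T}=\sum _{\alpha }M_{i\alpha }M_{\alpha j}^{-1}=\delta _{ij}$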
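
As a concrete check of step 5, consider a hypothetical 2×2 example (chosen here for illustration, not taken from the original page) whose rows are $\mathbf {v_{1}} =(3/5,\,4/5)$ and $\mathbf {v_{2}} =(-4/5,\,3/5)$ in the $\mathbf {{\hat {e}}_{j}}$ basis. The dot products among the rows are:

$\mathbf {v_{1}} \cdot \mathbf {v_{1}} ={\tfrac {9}{25}}+{\tfrac {16}{25}}=1,\qquad \mathbf {v_{1}} \cdot \mathbf {v_{2}} =-{\tfrac {12}{25}}+{\tfrac {12}{25}}=0,\qquad \mathbf {v_{2}} \cdot \mathbf {v_{2}} ={\tfrac {16}{25}}+{\tfrac {9}{25}}=1$

These are exactly the elements of the product of the matrix with its transpose:

${\underline {\underline {M}}}\cdot {\underline {\underline {M}}}^{T}={\begin{bmatrix}3/5&4/5\\-4/5&3/5\end{bmatrix}}\,{\begin{bmatrix}3/5&-4/5\\4/5&3/5\end{bmatrix}}={\begin{bmatrix}1&0\\0&1\end{bmatrix}}$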

Change of basis for tensors

If a matrix is used to rotate vectors, then use it twice to rotate tensors

A common use of the orthogonal matrix is to convert the components of a vector from one reference frame to a "rotated"^[8] frame.

Here, we let ${\underline {\underline {M}}}$ denote any matrix (i.e. "tensor"), while ${\underline {\underline {R}}}$ is any orthogonal matrix (typically a rotation). Let ${\underline {v}}$ and ${\underline {p}}$ be two vectors, and let ${\underline {v}}'$ and ${\underline {p}}'$ represent the same vectors in a rotated reference frame.

Theorem

If ${\underline {v}}'={\underline {\underline {R}}}\cdot {\underline {v}}$ , then: ${\underline {\underline {M}}}'={\underline {\underline {R}}}\cdot {\underline {\underline {M}}}\cdot {\underline {\underline {R}}}^{-1}$
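
A numeric spot check can make the theorem concrete. The sketch below is illustrative only: the angle, the matrix `M`, the vector `v`, and the helper names (`matvec`, `matmul`, `transpose`) are arbitrary choices made for this example, and plain Python lists are used instead of a linear algebra library. It assumes that the transformed tensor is the one that maps the rotated `v` to the rotated `p`, and it uses the transpose of `R` in place of its inverse.

```python
import math

def matvec(m, v):
    """Apply matrix m (list of rows) to vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

theta = 0.7  # arbitrary angle in radians
R = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]   # an orthogonal matrix
M = [[2.0, 1.0], [0.0, 3.0]]                # an arbitrary "tensor"
v = [1.0, -2.0]                             # an arbitrary vector

p = matvec(M, v)              # p = M v in the original frame
v_prime = matvec(R, v)        # v' = R v
p_prime = matvec(R, p)        # p' = R p

# M' = R M R^{-1}, with R^{-1} replaced by R^T because R is orthogonal
M_prime = matmul(matmul(R, M), transpose(R))

# If the theorem holds, M' v' reproduces p' up to rounding error.
error = max(abs(a - b) for a, b in zip(matvec(M_prime, v_prime), p_prime))
print(error)
```

The printed value is the largest component-wise difference between the two ways of computing the rotated `p`; it should be at the level of floating-point rounding.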

Proof

Define ${\underline {p}}={\underline {\underline {M}}}\cdot {\underline {v}}.$
Assume ${\underline {v}}'={\underline {\underline {R}}}\cdot {\underline {v}}$ and ${\underline {p}}'={\underline {\underline {R}}}\cdot {\underline {p}}.$
Do some tensor algebra and express ${\underline {p}}'$ in terms of ${\underline {v}}'.$

In this context, the only difference between the tensor and scalar algebras is that with tensors, products do not always commute: ${\underline {\underline {A}}}\cdot {\underline {\underline {B}}}-{\underline {\underline {B}}}\cdot {\underline {\underline {A}}}$ does not always vanish.
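
The algebra outlined in the proof can be written out as follows. This is one possible route. The only ingredients assumed beyond the three lines of the proof are that ${\underline {\underline {R}}}$ is invertible and that ${\underline {\underline {M}}}'$ is defined by requiring ${\underline {p}}'={\underline {\underline {M}}}'\cdot {\underline {v}}'$ in the rotated frame. Note that the order of the factors is never changed:

${\underline {p}}'={\underline {\underline {R}}}\cdot {\underline {p}}={\underline {\underline {R}}}\cdot {\underline {\underline {M}}}\cdot {\underline {v}}={\underline {\underline {R}}}\cdot {\underline {\underline {M}}}\cdot \left({\underline {\underline {R}}}^{-1}\cdot {\underline {\underline {R}}}\right)\cdot {\underline {v}}=\left({\underline {\underline {R}}}\cdot {\underline {\underline {M}}}\cdot {\underline {\underline {R}}}^{-1}\right)\cdot \left({\underline {\underline {R}}}\cdot {\underline {v}}\right)=\left({\underline {\underline {R}}}\cdot {\underline {\underline {M}}}\cdot {\underline {\underline {R}}}^{-1}\right)\cdot {\underline {v}}'$

Comparing this with ${\underline {p}}'={\underline {\underline {M}}}'\cdot {\underline {v}}'$ for every ${\underline {v}}'$ gives ${\underline {\underline {M}}}'={\underline {\underline {R}}}\cdot {\underline {\underline {M}}}\cdot {\underline {\underline {R}}}^{-1}.$ Because ${\underline {\underline {R}}}$ is orthogonal, ${\underline {\underline {R}}}^{-1}$ may be replaced by ${\underline {\underline {R}}}^{T}.$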

Derivation of the rotation tensor

Rotation of basis vectors. Since it is an active transformation, the sign on $\theta$ is opposite to the case for rotating a point.

This image illustrates a proof for a passive transformation, based on the rules for the sine and cosine of the sum of two angles.

The rotation matrix is usually the first orthogonal matrix students encounter. While it is conceptually easier to rotate vectors than to rotate a coordinate system, it is algebraically easier to rotate a coordinate system. From the figure, the unit vectors in a rotated reference frame obey:

${\begin{aligned}{\hat {x}}=&{\hat {x}}'\cos \theta -{\hat {y}}'\sin \theta \qquad &{\hat {y}}=&{\hat {x}}'\sin \theta +{\hat {y}}'\cos \theta \end{aligned}}$

Students will quickly see the sine and cosine components in this equation, but the minus sign might seem confusing. It comes from the fact that ${\hat {x}}$ has a negative component when projected along the ${\hat {y}}'$ direction. Now express the vector ${\underline {V}}$ , first in the unprimed coordinate system, then in primed:

${\underline {V}}=V_{x}{\hat {x}}+V_{y}{\hat {y}}$

To complete the proof, substitute the expressions that expressed the $({\hat {x}},{\hat {y}})$ unit vectors in terms of the $({\hat {x}}',{\hat {y}}')$ unit vectors:

${\underline {V}}=V_{x}\overbrace {\left({\hat {x}}'\cos \theta -{\hat {y}}'\sin \theta \right)} ^{\hat {x}}+V_{y}\overbrace {\left({\hat {x}}'\sin \theta +{\hat {y}}'\cos \theta \right)} ^{\hat {y}}$

${\underline {V}}=V_{x}{\hat {x}}'\cos \theta -V_{x}{\hat {y}}'\sin \theta +V_{y}{\hat {x}}'\sin \theta +V_{y}{\hat {y}}'\cos \theta$

${\underline {V}}=\underbrace {\left(V_{x}\cos \theta +V_{y}\sin \theta \right)} _{V_{x}'}{\hat {x}}'+\underbrace {\left(-V_{x}\sin \theta +V_{y}\cos \theta \right)} _{V_{y}'}{\hat {y}}'$

This latter expression solves our problem, as we were seeking an expression of the form, ${\underline {V}}=V_{x}'{\hat {x}}'+V_{y}'{\hat {y}}'.$

Note how in this formalism there is no distinction between the primed and unprimed vector ${\underline {V}}\,.$ Only its components change. This tends to confuse everyone, including the author. Such confusion can be avoided within a single textbook or article, but across the free-wheeling worlds of scientific literature and wikis, conflicting conventions cannot be avoided. That is why it is good to read books carefully.

Going back to the notation of many WMF pages, we have the following formula for the components of a vector if the coordinate system is rotated by $\theta$ about the z axis: ${\begin{bmatrix}V'_{x}\\V'_{y}\end{bmatrix}}={\begin{bmatrix}\cos \theta &\sin \theta \\-\sin \theta &\cos \theta \end{bmatrix}}\,{\begin{bmatrix}V_{x}\\V_{y}\end{bmatrix}}$
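
The component formula can be tried out directly. The following is a minimal sketch in plain Python. The function name `rotate_frame` and the test vector are assumptions made for this example, and the sketch takes a positive $\theta$ to mean a counterclockwise rotation of the axes, which is what the unit-vector relations above imply.

```python
import math

def rotate_frame(v, theta):
    """Components of the 2D vector v in a frame rotated by theta (passive rotation).

    Implements V'_x =  V_x cos(theta) + V_y sin(theta)
               V'_y = -V_x sin(theta) + V_y cos(theta)
    """
    vx, vy = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * vx + s * vy, -s * vx + c * vy)

v = (1.0, 0.0)                           # a vector along the original x axis
v_prime = rotate_frame(v, math.pi / 2)   # axes turned by a quarter turn
print(v_prime)

# Rotating the axes back by the same angle recovers the original components,
# and the length of the vector is the same in both frames.
v_back = rotate_frame(v_prime, -math.pi / 2)
print(v_back, math.hypot(*v), math.hypot(*v_prime))
```

After a quarter turn of the axes, the vector that pointed along the old x axis has components close to (0, −1) in the new frame: it lies along the negative direction of the new y axis, consistent with the minus sign discussed above.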

Notes

[1] The term "orthogonal" is confusing. A better word in this context would be orthonormal. See the lede sentence in w:special:Permalink/1181197344
[2] w:Special:Permalink/1181197344#Matrix_properties
[3] The physics student's first alternative to the "usual fashion" is the dot product in special relativity, where $\mathbf {\ell } \cdot \mathbf {\ell } '=xx'+yy'+zz'-c^{2}tt'$
[4] "Unit magnitude" means the dot product of the vector with itself equals 1.
[5] Most of this page is based on https://en.wikipedia.org/w/index.php?title=Orthogonal_matrix&oldid=1028769520
[6] https://en.wikipedia.org/w/index.php?title=Permutation_matrix&oldid=1015641816#Properties
[7] https://en.wikipedia.org/w/index.php?title=Permutation_matrix&oldid=1015641816#Properties
[8] The quotation marks on "rotated" are intended to include orthogonal matrices that are also reflections of an axis through the origin.
