Inner product spaces such as $\mathbb{R}^n$ can be given a family of $L_p$ norms, defined componentwise as
\[
  \lVert \mathbf{x} \rVert_p = \left( \sum_k |x_k|^p \right)^{1/p}, \qquad p = 1, 2, \dots, \infty\,.
\]
When $p = 1$, we get the $L_1$ norm
\[
  \lVert \mathbf{x} \rVert_1 = \sum_k |x_k|\,.
\]
When $p = 2$, we get the $L_2$ norm
\[
  \lVert \mathbf{x} \rVert_2 = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}\,.
\]
In the limit as $p \rightarrow \infty$ we get the $L_\infty$ norm, or sup norm,
\[
  \lVert \mathbf{x} \rVert_\infty = \max_k |x_k|\,.
\]
The adjacent figure shows a geometric interpretation of the three norms.
Geometric interpretation of various norms
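As a quick numerical illustration (not part of the original text), here is a minimal NumPy sketch that evaluates the three norms for an arbitrary example vector and compares them with NumPy's built-in `np.linalg.norm`:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])    # an arbitrary example vector

# L1 norm: sum of the absolute values of the components
l1 = np.sum(np.abs(x))
# L2 norm: square root of the inner product <x, x>
l2 = np.sqrt(np.dot(x, x))
# L-infinity (sup) norm: largest absolute component
linf = np.max(np.abs(x))

print(l1, l2, linf)
# These agree with NumPy's built-in norms:
print(np.linalg.norm(x, 1), np.linalg.norm(x, 2), np.linalg.norm(x, np.inf))
```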
If a vector space has an inner product, then the norm
\[
  \lVert \mathbf{x} \rVert = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} = \lVert \mathbf{x} \rVert_2
\]
is called the induced norm. Clearly, the induced norm is nonnegative and zero only if $\mathbf{x} = \mathbf{0}$. It is also homogeneous: $\lVert \alpha\,\mathbf{x} \rVert = |\alpha|\,\lVert \mathbf{x} \rVert$ for any scalar $\alpha$. You can think of the induced norm as a measure of length for the vector space.
Some useful results that follow from the definition of the norm are discussed below.
In an inner product space, the Cauchy-Schwarz inequality holds:
\[
  |\langle \mathbf{x}, \mathbf{y} \rangle| \le \lVert \mathbf{x} \rVert\,\lVert \mathbf{y} \rVert\,.
\]
Proof
This statement is clearly true if $\mathbf{y} = \mathbf{0}$. If $\mathbf{y} \neq \mathbf{0}$, then for any scalar $\alpha$ we have
\[
  0 \le \lVert \mathbf{x} - \alpha\,\mathbf{y} \rVert^2
  = \langle \mathbf{x} - \alpha\,\mathbf{y},\, \mathbf{x} - \alpha\,\mathbf{y} \rangle
  = \langle \mathbf{x}, \mathbf{x} \rangle
    - \langle \mathbf{x}, \alpha\,\mathbf{y} \rangle
    - \langle \alpha\,\mathbf{y}, \mathbf{x} \rangle
    + |\alpha|^2\,\langle \mathbf{y}, \mathbf{y} \rangle\,.
\]
Now
\[
  \langle \mathbf{x}, \alpha\,\mathbf{y} \rangle + \langle \alpha\,\mathbf{y}, \mathbf{x} \rangle
  = \bar{\alpha}\,\langle \mathbf{x}, \mathbf{y} \rangle + \alpha\,\langle \mathbf{y}, \mathbf{x} \rangle
  = \bar{\alpha}\,\langle \mathbf{x}, \mathbf{y} \rangle + \overline{\bar{\alpha}\,\langle \mathbf{x}, \mathbf{y} \rangle}
  = 2\,\text{Re}\!\left(\bar{\alpha}\,\langle \mathbf{x}, \mathbf{y} \rangle\right)\,.
\]
Therefore,
\[
  \lVert \mathbf{x} \rVert^2
  - 2\,\text{Re}\!\left(\bar{\alpha}\,\langle \mathbf{x}, \mathbf{y} \rangle\right)
  + |\alpha|^2\,\lVert \mathbf{y} \rVert^2 \ge 0\,.
\]
Let us choose $\alpha$ such that it minimizes the left hand side above. This value is clearly
\[
  \alpha = \frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\lVert \mathbf{y} \rVert^2}\,,
\]
which gives us
\[
  \lVert \mathbf{x} \rVert^2
  - 2\,\frac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\lVert \mathbf{y} \rVert^2}
  + \frac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\lVert \mathbf{y} \rVert^2} \ge 0\,.
\]
Therefore,
\[
  \lVert \mathbf{x} \rVert^2\,\lVert \mathbf{y} \rVert^2 \ge |\langle \mathbf{x}, \mathbf{y} \rangle|^2\,,
\]
and taking square roots gives the desired inequality. $\square$
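As a sanity check, the Cauchy-Schwarz inequality can be verified numerically; the randomly chosen vectors in the sketch below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

lhs = abs(np.dot(x, y))                        # |<x, y>|
rhs = np.linalg.norm(x) * np.linalg.norm(y)    # ||x|| ||y||
print(lhs <= rhs)                              # True for any choice of x and y
print(lhs, rhs)
```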
The triangle inequality states that
\[
  \lVert \mathbf{x} + \mathbf{y} \rVert \le \lVert \mathbf{x} \rVert + \lVert \mathbf{y} \rVert\,.
\]
Proof
\[
  \lVert \mathbf{x} + \mathbf{y} \rVert^2
  = \lVert \mathbf{x} \rVert^2 + 2\,\text{Re}\,\langle \mathbf{x}, \mathbf{y} \rangle + \lVert \mathbf{y} \rVert^2\,.
\]
Since $\text{Re}\,\langle \mathbf{x}, \mathbf{y} \rangle \le |\langle \mathbf{x}, \mathbf{y} \rangle|$, the Cauchy-Schwarz inequality gives
\[
  \lVert \mathbf{x} + \mathbf{y} \rVert^2
  \le \lVert \mathbf{x} \rVert^2 + 2\,\lVert \mathbf{x} \rVert\,\lVert \mathbf{y} \rVert + \lVert \mathbf{y} \rVert^2
  = \left( \lVert \mathbf{x} \rVert + \lVert \mathbf{y} \rVert \right)^2\,.
\]
Hence
\[
  \lVert \mathbf{x} + \mathbf{y} \rVert \le \lVert \mathbf{x} \rVert + \lVert \mathbf{y} \rVert\,. \qquad \square
\]
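The triangle inequality can be checked the same way; again, the random vectors below are only illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

lhs = np.linalg.norm(x + y)                        # ||x + y||
rhs = np.linalg.norm(x) + np.linalg.norm(y)        # ||x|| + ||y||
print(lhs <= rhs)                                  # True for any choice of x and y
print(lhs, rhs)
```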
Angle between two vectors
In $\mathbb{R}^2$ or $\mathbb{R}^3$ we have
\[
  \cos\theta = \frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\lVert \mathbf{x} \rVert\,\lVert \mathbf{y} \rVert}\,.
\]
So it makes sense to define $\cos\theta$ in this way for any real inner product space.
We then have
\[
  \lVert \mathbf{x} + \mathbf{y} \rVert^2
  = \lVert \mathbf{x} \rVert^2 + 2\,\lVert \mathbf{x} \rVert\,\lVert \mathbf{y} \rVert \cos\theta + \lVert \mathbf{y} \rVert^2\,.
\]
In particular, if $\cos\theta = 0$ we have an analog of the Pythagorean theorem:
\[
  \lVert \mathbf{x} + \mathbf{y} \rVert^2 = \lVert \mathbf{x} \rVert^2 + \lVert \mathbf{y} \rVert^2\,.
\]
In that case the vectors are said to be orthogonal.
If $\langle \mathbf{x}, \mathbf{y} \rangle = 0$, then the vectors are said to be orthogonal even in a complex vector space.
Orthogonal vectors have a lot of nice properties.
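Before moving on, here is a small numerical check of the cosine formula and the Pythagorean relation above (an illustrative sketch; the vectors are chosen only for the example):

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, 1.0, -2.0])     # chosen so that <x, y> = 2 + 2 - 4 = 0

cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos_theta)                   # 0.0, so the vectors are orthogonal

# Pythagorean check: ||x + y||^2 equals ||x||^2 + ||y||^2
print(np.linalg.norm(x + y)**2, np.linalg.norm(x)**2 + np.linalg.norm(y)**2)
```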
Linear independence of orthogonal vectors
A set of nonzero orthogonal vectors is linearly independent.
Suppose the orthogonal vectors $\boldsymbol{\varphi}_i$ satisfy
\[
  \alpha_1\,\boldsymbol{\varphi}_1 + \alpha_2\,\boldsymbol{\varphi}_2 + \dots + \alpha_n\,\boldsymbol{\varphi}_n = \mathbf{0}\,.
\]
Taking the inner product with $\boldsymbol{\varphi}_j$ gives
\[
  \alpha_j\,\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_j \rangle = 0
  \quad \implies \quad \alpha_j = 0 \;\;\forall\, j
\]
since
\[
  \langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle = 0 \quad \text{if } i \neq j
\]
and $\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_j \rangle \neq 0$ for nonzero vectors. Since every coefficient must vanish, the only linear combination that gives the zero vector is the trivial one, and therefore the vectors are linearly independent.
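The same conclusion can be seen numerically: stacking nonzero orthogonal vectors as the rows of a matrix gives a diagonal Gram matrix and full rank. The vectors below are chosen only for illustration:

```python
import numpy as np

# Three mutually orthogonal, nonzero vectors in R^3, stored as rows
phi = np.array([[1.0,  1.0, 0.0],
                [1.0, -1.0, 0.0],
                [0.0,  0.0, 2.0]])

print(phi @ phi.T)                   # diagonal Gram matrix: off-diagonal inner products vanish
print(np.linalg.matrix_rank(phi))    # 3, so the vectors are linearly independent
```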
Expressing a vector in terms of an orthogonal basis
If we have a basis $\{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}$ and wish to express a vector $\mathbf{f}$ in terms of it, we have
\[
  \mathbf{f} = \sum_{j=1}^n \beta_j\,\boldsymbol{\varphi}_j\,.
\]
The problem is to find the coefficients $\beta_j$.
If we take the inner product with respect to $\boldsymbol{\varphi}_i$, we get
\[
  \langle \mathbf{f}, \boldsymbol{\varphi}_i \rangle
  = \sum_{j=1}^n \beta_j\,\langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle\,.
\]
In matrix form,
\[
  \boldsymbol{\eta} = \boldsymbol{B}\,\boldsymbol{\beta}
\]
where $B_{ij} = \langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle$ and $\eta_i = \langle \mathbf{f}, \boldsymbol{\varphi}_i \rangle$.
In general, finding the $\beta_j$ requires inverting the $n \times n$ matrix $\boldsymbol{B}$. If the basis is orthogonal, however, $\boldsymbol{B}$ is diagonal, and if it is orthonormal then $\boldsymbol{B}$ is the identity matrix $\boldsymbol{I}_n$, because $\langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta. Provided that the $\boldsymbol{\varphi}_i$ are orthogonal, we therefore have
\[
  \beta_j = \frac{\langle \mathbf{f}, \boldsymbol{\varphi}_j \rangle}{\lVert \boldsymbol{\varphi}_j \rVert^2}
\]
and the quantity
\[
  \mathbf{p} = \frac{\langle \mathbf{f}, \boldsymbol{\varphi}_j \rangle}{\lVert \boldsymbol{\varphi}_j \rVert^2}\,\boldsymbol{\varphi}_j
\]
is called the projection of $\mathbf{f}$ onto $\boldsymbol{\varphi}_j$.
Therefore the expansion
\[
  \mathbf{f} = \sum_j \beta_j\,\boldsymbol{\varphi}_j
\]
says that $\mathbf{f}$ is just the sum of its projections onto the orthogonal basis vectors.
Projection operation.
Let us check whether $\mathbf{p}$ is actually a projection. Let
\[
  \mathbf{a} = \mathbf{f} - \mathbf{p}
  = \mathbf{f} - \frac{\langle \mathbf{f}, \boldsymbol{\varphi} \rangle}{\lVert \boldsymbol{\varphi} \rVert^2}\,\boldsymbol{\varphi}\,.
\]
Then,
\[
  \langle \mathbf{a}, \boldsymbol{\varphi} \rangle
  = \langle \mathbf{f}, \boldsymbol{\varphi} \rangle
    - \frac{\langle \mathbf{f}, \boldsymbol{\varphi} \rangle}{\lVert \boldsymbol{\varphi} \rVert^2}\,\langle \boldsymbol{\varphi}, \boldsymbol{\varphi} \rangle
  = 0\,.
\]
Therefore $\mathbf{a}$ and $\boldsymbol{\varphi}$ are indeed orthogonal.
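A short NumPy sketch (illustrative only; the basis and the vector f are arbitrary choices) that computes the coefficients $\beta_j = \langle \mathbf{f}, \boldsymbol{\varphi}_j \rangle / \lVert \boldsymbol{\varphi}_j \rVert^2$, reconstructs $\mathbf{f}$ from its projections, and checks that the residual of a single projection is orthogonal to that basis vector:

```python
import numpy as np

# An orthogonal (but not orthonormal) basis of R^3, stored as rows
phi = np.array([[1.0,  1.0, 0.0],
                [1.0, -1.0, 0.0],
                [0.0,  0.0, 2.0]])
f = np.array([3.0, 1.0, 5.0])

# Expansion coefficients beta_j = <f, phi_j> / ||phi_j||^2
beta = (phi @ f) / np.sum(phi**2, axis=1)
print(beta)

# f is the sum of its projections onto the basis vectors
print(np.allclose(f, beta @ phi))        # True

# The residual of a single projection is orthogonal to that basis vector
p = beta[0] * phi[0]
a = f - p
print(np.dot(a, phi[0]))                 # 0.0
```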
Note that we can normalize $\boldsymbol{\varphi}_i$ by defining
\[
  \tilde{\boldsymbol{\varphi}}_i = \frac{\boldsymbol{\varphi}_i}{\lVert \boldsymbol{\varphi}_i \rVert}\,.
\]
Then the basis $\{\tilde{\boldsymbol{\varphi}}_1, \tilde{\boldsymbol{\varphi}}_2, \dots, \tilde{\boldsymbol{\varphi}}_n\}$ is called an orthonormal basis.
It follows from the equation for $\beta_j$ that
\[
  \tilde{\beta}_j = \langle \mathbf{f}, \tilde{\boldsymbol{\varphi}}_j \rangle
\]
and
\[
  \mathbf{f} = \sum_{j=1}^n \tilde{\beta}_j\,\tilde{\boldsymbol{\varphi}}_j\,.
\]
You can think of the vectors $\tilde{\boldsymbol{\varphi}}_i$ as orthogonal unit vectors in an $n$-dimensional space.
However, using an orthogonal basis is not the only way to do things. An alternative that is useful (for instance when using wavelets) is the biorthonormal basis. The problem in this case becomes: given any basis $\{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}$, find another set of vectors $\{\boldsymbol{\psi}_1, \boldsymbol{\psi}_2, \dots, \boldsymbol{\psi}_n\}$ such that
\[
  \langle \boldsymbol{\varphi}_i, \boldsymbol{\psi}_j \rangle = \delta_{ij}\,.
\]
In that case, if
\[
  \mathbf{f} = \sum_{j=1}^n \beta_j\,\boldsymbol{\varphi}_j
\]
it follows that
\[
  \langle \mathbf{f}, \boldsymbol{\psi}_k \rangle
  = \sum_{j=1}^n \beta_j\,\langle \boldsymbol{\varphi}_j, \boldsymbol{\psi}_k \rangle = \beta_k\,.
\]
So the coefficients $\beta_k$ can easily be recovered. You can see a schematic of the two sets of vectors in the adjacent figure.
Biorthonormal basis
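In finite dimensions one simple way to construct such a dual set is from the inverse of the matrix whose rows are the $\boldsymbol{\varphi}_i$; the sketch below (an illustration under that assumption, not the wavelet construction) builds the dual basis and recovers the coefficients:

```python
import numpy as np

# A non-orthogonal basis of R^3, stored as the rows of Phi
Phi = np.array([[1.0, 0.0, 0.0],
                [1.0, 1.0, 0.0],
                [1.0, 1.0, 1.0]])

# Dual (biorthogonal) set: the rows of Psi must satisfy <phi_i, psi_j> = delta_ij,
# i.e. Phi @ Psi.T = I, so Psi = inv(Phi).T
Psi = np.linalg.inv(Phi).T
print(np.round(Phi @ Psi.T, 12))       # identity matrix

# If f = sum_j beta_j phi_j, then beta_k = <f, psi_k>
beta = np.array([2.0, -1.0, 3.0])      # arbitrary coefficients
f = beta @ Phi
print(Psi @ f)                         # recovers [ 2. -1.  3.]
```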
Gram-Schmidt orthogonalization
One technique for getting an orthogonal basis is to use the process of Gram-Schmidt orthogonalization.
The goal is to produce an orthogonal set of vectors $\{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}$ given a linearly independent set $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}$.
We start by setting $\boldsymbol{\varphi}_1 = \mathbf{x}_1$. Then $\boldsymbol{\varphi}_2$ is obtained by subtracting the projection of $\mathbf{x}_2$ onto $\boldsymbol{\varphi}_1$ from $\mathbf{x}_2$, i.e.,
\[
  \boldsymbol{\varphi}_2 = \mathbf{x}_2
  - \frac{\langle \mathbf{x}_2, \boldsymbol{\varphi}_1 \rangle}{\lVert \boldsymbol{\varphi}_1 \rVert^2}\,\boldsymbol{\varphi}_1\,.
\]
Thus $\boldsymbol{\varphi}_2$ is clearly orthogonal to $\boldsymbol{\varphi}_1$. For $\boldsymbol{\varphi}_3$ we use
\[
  \boldsymbol{\varphi}_3 = \mathbf{x}_3
  - \frac{\langle \mathbf{x}_3, \boldsymbol{\varphi}_1 \rangle}{\lVert \boldsymbol{\varphi}_1 \rVert^2}\,\boldsymbol{\varphi}_1
  - \frac{\langle \mathbf{x}_3, \boldsymbol{\varphi}_2 \rangle}{\lVert \boldsymbol{\varphi}_2 \rVert^2}\,\boldsymbol{\varphi}_2\,.
\]
More generally,
\[
  \boldsymbol{\varphi}_n = \mathbf{x}_n
  - \sum_{j=1}^{n-1} \frac{\langle \mathbf{x}_n, \boldsymbol{\varphi}_j \rangle}{\lVert \boldsymbol{\varphi}_j \rVert^2}\,\boldsymbol{\varphi}_j\,.
\]
If you want an orthonormal set, you can obtain one by normalizing each of the orthogonal vectors.
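A minimal sketch of the classical Gram-Schmidt process described above (the function name and the final normalization step are my additions, not from the text):

```python
import numpy as np

def gram_schmidt(X):
    """Classical Gram-Schmidt: turn the rows of X into orthogonal rows."""
    Phi = []
    for x in X:
        phi = np.array(x, dtype=float)
        # Subtract the projection of x onto every previously computed phi_j
        for p in Phi:
            phi -= (np.dot(x, p) / np.dot(p, p)) * p
        Phi.append(phi)
    return np.array(Phi)

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Phi = gram_schmidt(X)
print(np.round(Phi @ Phi.T, 12))       # diagonal Gram matrix: the rows are orthogonal

# Normalize to get an orthonormal set if desired
Phi_tilde = Phi / np.linalg.norm(Phi, axis=1, keepdims=True)
print(np.round(Phi_tilde @ Phi_tilde.T, 12))   # identity matrix
```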
We can check that the vectors $\boldsymbol{\varphi}_j$ are indeed orthogonal by induction. Assume that the vectors $\boldsymbol{\varphi}_j$, $j \le n-1$, are mutually orthogonal. Pick $k < n$. Then
\[
  \langle \boldsymbol{\varphi}_n, \boldsymbol{\varphi}_k \rangle
  = \langle \mathbf{x}_n, \boldsymbol{\varphi}_k \rangle
  - \sum_{j=1}^{n-1} \frac{\langle \mathbf{x}_n, \boldsymbol{\varphi}_j \rangle}{\lVert \boldsymbol{\varphi}_j \rVert^2}\,\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_k \rangle\,.
\]
Now $\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_k \rangle = 0$ unless $j = k$, so only the $j = k$ term survives in the sum. That term equals $\langle \mathbf{x}_n, \boldsymbol{\varphi}_k \rangle$, which cancels the first term, and therefore $\langle \boldsymbol{\varphi}_n, \boldsymbol{\varphi}_k \rangle = 0$. Hence the vectors are orthogonal.
Note that you have to be careful when computing an orthogonal basis numerically with the Gram-Schmidt technique, because rounding errors accumulate in the projection terms under the sum and the computed vectors gradually lose orthogonality.
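One way to see this error growth (an illustrative sketch; the nearly dependent test vectors and the comparison with the modified variant are my choices, not from the text) is to orthogonalize nearly linearly dependent vectors and measure how far the normalized Gram matrix is from the identity. The modified Gram-Schmidt variant, which projects the running vector rather than the original one, behaves much better:

```python
import numpy as np

def classical_gs(X):
    """Classical Gram-Schmidt: project the original vector x onto each phi_j."""
    Phi = []
    for x in X:
        phi = np.array(x, dtype=float)
        for p in Phi:
            phi -= (np.dot(x, p) / np.dot(p, p)) * p
        Phi.append(phi)
    return np.array(Phi)

def modified_gs(X):
    """Modified Gram-Schmidt: project the running vector phi onto each phi_j."""
    Phi = []
    for x in X:
        phi = np.array(x, dtype=float)
        for p in Phi:
            phi -= (np.dot(phi, p) / np.dot(p, p)) * p
        Phi.append(phi)
    return np.array(Phi)

def orthogonality_error(Phi):
    """Maximum deviation of the normalized Gram matrix from the identity."""
    Q = Phi / np.linalg.norm(Phi, axis=1, keepdims=True)
    return np.max(np.abs(Q @ Q.T - np.eye(len(Q))))

# Nearly linearly dependent vectors exaggerate round-off in the projection sums
eps = 1e-8
X = np.array([[1.0, eps, 0.0, 0.0],
              [1.0, 0.0, eps, 0.0],
              [1.0, 0.0, 0.0, eps]])

print(orthogonality_error(classical_gs(X)))   # large loss of orthogonality
print(orthogonality_error(modified_gs(X)))    # much smaller error
```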