Measure Theory/Convexity and the Product-to-Sum Bound

Convexity

Recall from Lesson 0: Generalizing to $L^{p}$ that we are interested in convexity because it may help us find a bound for the product of any two real numbers. In particular, we would like to prove the inequality

w_{1}\ln a+w_{2}\ln b\leq \ln(w_{1}a+w_{2}b)

for arbitrary

a,b\in {\mathbb {R}}^{+}

and weights

w_{1},w_{2}\in {\mathbb {R}}^{+}

such that

w_{1}+w_{2}=1
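Before proving anything, it can be reassuring to test the target inequality numerically. The following is a minimal sketch, not a proof: the helper name `log_inequality_gap` and the sampling ranges are assumptions chosen only for illustration.

```python
import math
import random


def log_inequality_gap(a, b, w1):
    """Return ln(w1*a + w2*b) - (w1*ln a + w2*ln b), where w2 = 1 - w1.

    The inequality in the text claims this gap is never negative
    for a, b > 0 and 0 < w1 < 1.
    """
    w2 = 1.0 - w1
    return math.log(w1 * a + w2 * b) - (w1 * math.log(a) + w2 * math.log(b))


random.seed(0)  # fixed seed so the experiment is repeatable
gaps = []
for _ in range(1000):
    a = random.uniform(0.01, 100.0)
    b = random.uniform(0.01, 100.0)
    w1 = random.uniform(0.01, 0.99)
    gaps.append(log_inequality_gap(a, b, w1))

# A tiny negative tolerance absorbs floating-point rounding.
print(min(gaps) >= -1e-12)
```

A sampled check like this can only fail to find a counterexample; the exercises below supply the actual argument.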

Why is this inequality an instance of convexity? It will help to think about convexity in the context of an abstract vector space.

Intuitively, a shape is convex if, for any two points in the shape, the line segment between them lies inside the shape. For example, every triangle is convex and every filled circle (disk) is convex, but there exist quadrilaterals which are not convex. A shape that is not convex is often called concave.

Exercise 1. Draw a Convex Shape

Draw a convex shape, just with the intuitive definition.

More rigorously, let's consider any abstract vector space V. It is natural to picture ${\mathbb {R}}^{n}$ for $n=1,2,3$ .

Definition: line segment, convex shape

Let V be any vector space and ${\vec {v}},{\vec {w}}\in V$ . We define the line segment between ${\vec {v}}$ and ${\vec {w}}$ by the set of points

L_{\vec {v}}^{\vec {w}}=\{t{\vec {v}}+(1-t){\vec {w}}:t\in [0,1]\}

We say that a subset $S\subseteq V$ is convex if for every two points ${\vec {v}},{\vec {w}}\in S$ the line segment between them is contained in S, that is,

L_{\vec {v}}^{\vec {w}}\subseteq S

for all

{\vec {v}},{\vec {w}}\in S
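The definition above can be explored numerically by sampling points $t{\vec {v}}+(1-t){\vec {w}}$ along a segment and testing membership in a set. This is only a sketch: the function names, the annulus example, and the number of sample points are assumptions made for illustration, and a finite sample can suggest convexity but never prove it.

```python
import math


def segment_points(v, w, steps=50):
    """Sample the points t*v + (1-t)*w for t = 0, 1/steps, ..., 1."""
    points = []
    for i in range(steps + 1):
        t = i / steps
        points.append(tuple(t * vi + (1 - t) * wi for vi, wi in zip(v, w)))
    return points


def segment_stays_inside(contains, v, w, steps=50):
    """Return True if every sampled point of the segment satisfies `contains`."""
    return all(contains(p) for p in segment_points(v, w, steps))


def norm(p):
    """Euclidean norm of a point given as a tuple of coordinates."""
    return math.sqrt(sum(x * x for x in p))


def in_open_unit_ball(p):
    return norm(p) < 1.0


def in_annulus(p):
    # A ring with a hole in the middle: a standard non-convex set.
    return 0.5 < norm(p) < 1.0


v, w = (0.9, 0.0), (-0.9, 0.0)
print(segment_stays_inside(in_open_unit_ball, v, w))  # segment stays in the ball
print(segment_stays_inside(in_annulus, v, w))         # segment crosses the hole
```

Both endpoints lie in the annulus, yet the segment passes through the origin, which is exactly the kind of failure the definition rules out.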

Exercise 2. Prove a Convex Shape

In this exercise, consider the vector spaces ${\mathbb {R}}^{n}$ for positive natural numbers n.

Prove that the open unit ball is convex (i.e. $\{{\vec {v}}:\|{\vec {v}}\|<1\}$ ).

Also prove that convex sets remain convex under positive scalar multiplication and translation. That is to say, if $c\in {\mathbb {R}}^{+}$ and $S\subseteq {\mathbb {R}}^{n}$ is convex, then

cS=\{c{\vec {v}}:{\vec {v}}\in S\}

is convex. And if ${\vec {w}}\in {\mathbb {R}}^{n}$ then

{\vec {w}}+S=\{{\vec {w}}+{\vec {v}}:{\vec {v}}\in S\}

is convex.

To relate all of this back to our mission of proving the inequality at the start, we want a notion of convexity for functions. We then hope that this notion applies to the logarithm and to the inequality at the beginning.

Therefore the following definition seems reasonable:

Definition: epigraph, convex function

Let $f:I\to {\mathbb {R}}$ be a function on an interval of real numbers $I\subseteq {\mathbb {R}}$ .

Define the epigraph of f to be the set of all points on or above the graph of f:

{\text{epi}}(f)=\{(x,y):x\in I{\text{ and }}f(x)\leq y\}

We say that f is convex if its epigraph is convex.
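To make the epigraph concrete, here is a small sketch that tests membership in ${\text{epi}}(f)$ and looks at the midpoint of two epigraph points. The helper names and the choice of a closed interval `domain` are assumptions for illustration only.

```python
def in_epigraph(f, domain, point):
    """Return True if point = (x, y) has x in the domain and f(x) <= y.

    `domain` is a hypothetical (lo, hi) pair, treated here as a closed interval.
    """
    lo, hi = domain
    x, y = point
    return lo <= x <= hi and f(x) <= y


def midpoint(p, q):
    """The t = 1/2 point of the segment between two points in the plane."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def square(x):
    return x * x


def neg_square(x):
    return -(x * x)


domain = (-2.0, 2.0)

# Two points of epi(x^2); their midpoint (0, 1) is again in the epigraph.
p, q = (-1.0, 1.0), (1.0, 1.0)
print(in_epigraph(square, domain, midpoint(p, q)))

# Two points of epi(-x^2); their midpoint (0, -1) lies below the graph.
r, s = (-1.0, -1.0), (1.0, -1.0)
print(in_epigraph(neg_square, domain, midpoint(r, s)))
```

The second example shows a single segment leaving the epigraph, which is enough to conclude that $-x^{2}$ is not convex on this interval.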

One may wonder, as I did, why we define a function to be convex if its epigraph is convex, rather than, say, its "hypograph" (i.e. the set of points below the graph)? As far as I can tell this is a mere convention -- a coin was tossed and I guess we live with the legacy of defining convexity of functions by their epigraph rather than hypograph.

Nothing much hinges on this. Although the logarithm is not convex, its negative is convex, i.e. $-\ln x$ is a convex function, and that is enough to use the results that we will find about convex functions.

Exercise 3. Convex only at the Boundary

Show that a function's convexity is determined by its graph, in the sense that one does not need to consider any points not on the graph. More precisely, a function is convex if and only if for any points on the graph, the segment between them is contained in its epigraph.

Even more rigorously, let $f:I\to {\mathbb {R}}$ be a real function on an interval I. Also let $a,b\in I$ . Note that any point on the segment between $(a,f(a))$ and $(b,f(b))$ is identified with a value of $t\in [0,1]$ such that the point is given by

t{\begin{bmatrix}a\\f(a)\end{bmatrix}}+(1-t){\begin{bmatrix}b\\f(b)\end{bmatrix}}

(You will need to identify points with vectors, which is no big whoop: you just understand that $(a,f(a))$ corresponds to the vector ${\begin{bmatrix}a\\f(a)\end{bmatrix}}$ .)

For the function to be convex, this point must be contained in the epigraph. The point has first coordinate $ta+(1-t)b$ and second coordinate $tf(a)+(1-t)f(b)$ , so by the definition of the epigraph, membership is the same as

\overbrace {f(ta+(1-t)b)} ^{\text{on the graph}}\leq \overbrace {tf(a)+(1-t)f(b)} ^{\text{on the segment}}

So what you must show is that f is convex if and only if, for all $a,b\in I$ and $t\in [0,1]$ ,

f(ta+(1-t)b)\leq tf(a)+(1-t)f(b)
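The inequality $f(ta+(1-t)b)\leq tf(a)+(1-t)f(b)$ is easy to test on a grid of sample values. The sketch below does this for $x^{2}$ , $-\ln x$ and $\ln x$ ; the function names, the grid, and the tolerance are assumptions for illustration, and passing a sampled check is evidence rather than proof.

```python
import math


def violates_convexity(f, a, b, t, tol=1e-12):
    """Return True if f(t*a + (1-t)*b) exceeds t*f(a) + (1-t)*f(b) beyond tol."""
    on_graph = f(t * a + (1 - t) * b)
    on_segment = t * f(a) + (1 - t) * f(b)
    return on_graph > on_segment + tol


def passes_sampled_check(f, xs, ts):
    """Return True if no sampled triple (a, b, t) violates the inequality."""
    return not any(
        violates_convexity(f, a, b, t) for a in xs for b in xs for t in ts
    )


xs = [0.1 * k for k in range(1, 51)]  # sample points 0.1, 0.2, ..., 5.0
ts = [k / 10 for k in range(11)]      # t = 0, 0.1, ..., 1

print(passes_sampled_check(lambda x: x * x, xs, ts))         # x^2
print(passes_sampled_check(lambda x: -math.log(x), xs, ts))  # -ln x
print(passes_sampled_check(math.log, xs, ts))                # ln x
```

For $\ln x$ the check fails already at $a=0.1$ , $b=5$ , $t=1/2$ , where the graph sits above the chord.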

Exercise 4. Convexity and Differentiation

Suppose that f is twice differentiable on an interval I. Now show that f is convex if and only if the second derivative is nonnegative throughout I.

Infer that $x^{2}$ and $-\ln x$ are both convex.
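As a sanity check on the sign of the second derivative, one can estimate $f''(x)$ with a central second difference. This is a rough numerical sketch: the helper name `second_difference` and the step size `h` are assumptions, and the estimate carries both rounding and truncation error.

```python
import math


def second_difference(f, x, h=1e-4):
    """Approximate f''(x) by (f(x+h) - 2 f(x) + f(x-h)) / h^2."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def square(x):
    return x * x


def neg_log(x):
    return -math.log(x)


# For x^2 the exact second derivative is 2; for -ln x it is 1/x^2.
for x in (0.5, 1.0, 2.0, 3.0):
    print(x, second_difference(square, x), second_difference(neg_log, x))
```

Both columns of estimates stay positive at the sampled points, which agrees with the exact second derivatives $2$ and $1/x^{2}$ .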

Product-to-Sum Bound

Finally we are in a position to deliver on what put us down this path so long ago!

Exercise 5. The Weighted AM-GM Proof

Infer

w_{1}\ln a+w_{2}\ln b\leq \ln(w_{1}a+w_{2}b)

for all

a,b\in {\mathbb {R}}^{+}

and

w_{1},w_{2}\in {\mathbb {R}}^{+}

such that

w_{1}+w_{2}=1

Infer further the weighted AM-GM inequality.

a^{w_{1}}b^{w_{2}}\leq w_{1}a+w_{2}b
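The weighted AM-GM inequality can also be checked on random inputs. The following sketch compares the weighted geometric mean $a^{w_{1}}b^{w_{2}}$ with the weighted arithmetic mean $w_{1}a+w_{2}b$ ; the function names and sampling ranges are assumptions for illustration.

```python
import random


def weighted_gm(a, b, w1):
    """Weighted geometric mean a**w1 * b**w2 with w2 = 1 - w1."""
    return a ** w1 * b ** (1.0 - w1)


def weighted_am(a, b, w1):
    """Weighted arithmetic mean w1*a + w2*b with w2 = 1 - w1."""
    return w1 * a + (1.0 - w1) * b


random.seed(1)  # fixed seed so the experiment is repeatable
all_hold = True
for _ in range(1000):
    a = random.uniform(0.01, 100.0)
    b = random.uniform(0.01, 100.0)
    w1 = random.uniform(0.01, 0.99)
    # A small relative tolerance absorbs floating-point rounding.
    if weighted_gm(a, b, w1) > weighted_am(a, b, w1) * (1.0 + 1e-12):
        all_hold = False

print(all_hold)
print(weighted_gm(4.0, 9.0, 0.5), weighted_am(4.0, 9.0, 0.5))
```

With equal weights and $a=4$ , $b=9$ this reduces to the familiar comparison of $\sqrt{36}=6$ against $13/2$ .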

Exercise 6. The Product-to-Sum Bound

Recall that the point of proving the weighted AM-GM inequality was the hope that it would give us a useful bound on an arbitrary product of real numbers. Since we are interested in values $1\leq p$ , the inequality seems useful only if we take $1/p$ as one of the weights, and correspondingly use the equation

{\frac {1}{p}}+{\frac {1}{q}}=1

to define a second value, q, so that $1/p$ and $1/q$ are weights to which the inequality applies.

But note that in the case $p=1$ there is no corresponding real number q. Could we take $q=\infty$ in some sense? We will return to this idea in the last section, so for now only consider $1<p$ .

Use this to infer a product-to-sum bound for the product of any two real numbers.

Hint: We said in Lesson 0: Generalizing to $L^{p}$ that we could expect to have powers of p in our bound, since it is fine if our result is related to the $\|\cdot \|_{p}$ norm. So replace a in the weighted AM-GM inequality by $a^{p}$ and so on.

Definition: conjugate

Let $1<p$ be any real number. The number q defined by

{\frac {1}{p}}+{\frac {1}{q}}=1

is called the conjugate of p. In this case, we write $q=p^{*}$ .
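Solving ${\frac {1}{p}}+{\frac {1}{q}}=1$ for q gives $q=p/(p-1)$ , which is straightforward to compute. The sketch below uses a hypothetical helper named `conjugate`; the name and the choice to raise an error for $p\leq 1$ are assumptions, not part of the lesson.

```python
def conjugate(p):
    """Return q = p / (p - 1), the solution of 1/p + 1/q = 1 for p > 1."""
    if p <= 1:
        raise ValueError("the conjugate is only defined here for p > 1")
    return p / (p - 1.0)


for p in (1.001, 1.5, 2.0, 3.0, 10.0):
    q = conjugate(p)
    # The last column should be 1 up to floating-point rounding.
    print(p, q, 1.0 / p + 1.0 / q)
```

Two things are visible in the output: $p=2$ is its own conjugate, and q grows without bound as p approaches 1 from above, which fits the question of whether $q=\infty$ makes sense for $p=1$ .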