## Introduction

In the abstract study of partial differential equations, we often throw the concept of weak derivatives around as if they acted just like classical derivatives. The product rule is one such example. When studying evolution equations (time-dependent partial differential equations such as the Navier-Stokes equations), books blow past the following equality

$\left(\frac{d}{dt}u(t),u(t)\right)=\frac{1}{2}\frac{d}{dt}|u(t)|^2$

where $(\cdot,\cdot)$ is some sort of inner product or integration and $|\cdot|$ is some sort of associated norm. What really happened is a coupling of a weak product rule and differentiating under the integral sign. But, to make these assumptions, we are often thinking of a “smoothed” version of the equation, where we have replaced all the nasty functions (often lying in some Hilbert space or Banach space) with appropriate “smoothified” cousins.

What does it mean for the weak derivative to respect the product rule? That is, for weakly differentiable functions $u$ and $v$ with weak derivatives $u'$ and $v'$, respectively, is it the case that the weak derivative of their product $uv$ exists? If so, is it given by

$(uv)'=u'v+uv'$?

In fact, if only one of them is smooth, we can easily show that, yes, this is the case. What do we mean by smooth?

Definition: Let $\Omega\subseteq\mathbb{R}^n$ be open. We say that $v:\Omega\rightarrow\mathbb{R}$ is smooth if it is infinitely differentiable (in a classical sense). In this case, we will write that $v\in C^\infty(\Omega)$.

Theorem 1: Let $\Omega\subseteq\mathbb{R}^n$ be open. Let $u:\Omega\rightarrow\mathbb{R}$ be weakly differentiable, and let $v\in C^\infty(\Omega)$. Then, their product $uv$ is weakly differentiable. Moreover, the weak derivative of $uv$ is given by

$(uv)'=u'v+uv'$.

Proof: Let $\phi\in C^\infty_0(\Omega)$ be a test function. Then, we have that $v\phi\in C^\infty_0(\Omega)$. Moreover,

$(v\phi)'=v'\phi+v\phi'$.

Thus, rearranging this, we have that

$\displaystyle \int_\Omega uv\phi' dx = \int_\Omega u(v\phi)'dx-\int_\Omega uv'\phi dx$.

Using the weak differentiability of $u$ on the first term on the right-hand side, we see that

$\displaystyle\int_\Omega uv\phi' dx=-\int_\Omega u'v\phi dx -\int_\Omega uv'\phi dx = -\int_\Omega(u'v+uv')\phi dx$

and the theorem is proved.

QED.

However, we don’t need to assume that $v$ is smooth. We just have to replace it with something smooth and take limits. That is the magic of mollification.

## Mollification

Mollification is the process of using convolution to replace a function with a smooth version of it which has nice limiting properties. The function we are convolving against is known as a mollifier.

Definition: A function $m:\mathbb{R}^n\rightarrow\mathbb{R}$ is called a mollifier if

1. $m\in C^\infty_0(\mathbb{R}^n)$.
2. $\displaystyle\int_{\mathbb{R}^n}m(x)dx = 1$.
3. For each integrable $u$,

$\displaystyle\int_{\mathbb{R}^n}u(y)\frac{1}{h^n}m\left(\frac{x-y}{h}\right)dy\rightarrow u(x)$ as $h\rightarrow 0$.

We will use a function called the standard mollifier. That is, $m:\mathbb{R}^n\rightarrow\mathbb{R}$ given by

$m(x):=\left\{\begin{array}{ll} c\,\exp\left(\frac{1}{|x|^2-1}\right) & |x|< 1 \\ 0 & |x|\ge 1\end{array}\right.$

where $c$ is chosen so that $\int m(x) dx=1$ (in one dimension, $c\approx 2.25$). We have not established property 3 in the definition of a mollifier. That, we will leave until later. The graph for $m$ in one dimension is as follows:

Now, we see that the rescaled version of $m$ given by $m_h(x):=\frac{1}{h^n}m(x/h)$ looks as follows in one dimension ($h=0.1$):

Using this rescaling, we have that $m_h$ has support in $\{|x|\le h\}$, and using a simple change of variables ($x\mapsto hx$),

$\displaystyle\int_{\mathbb{R}^n}m_h(x) dx = \int_{\mathbb{R}^n}m(x) dx =1$.
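
As a quick numerical sanity check of the normalization, here is a minimal Python sketch in one dimension ($n=1$). The helper names `bump`, `midpoint`, `m` and `m_h` are made up for this illustration, and the midpoint rule is just one convenient quadrature choice, not anything prescribed by the theory.

```python
import math

def bump(x: float) -> float:
    """Unnormalized 1-D bump exp(1/(x^2 - 1)) on |x| < 1, zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(1.0 / (x * x - 1.0))

def midpoint(f, a: float, b: float, n: int = 20000) -> float:
    """Composite midpoint rule for the integral of f over [a, b]."""
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

# Normalizing constant: c = 1 / (integral of the bump over [-1, 1]).
c = 1.0 / midpoint(bump, -1.0, 1.0)

def m(x: float) -> float:
    """Standard mollifier in one dimension."""
    return c * bump(x)

def m_h(x: float, h: float) -> float:
    """Rescaled mollifier for n = 1: m(x / h) / h, supported in |x| <= h."""
    return m(x / h) / h

h = 0.1
mass = midpoint(lambda x: m_h(x, h), -h, h)
print(f"c = {c:.4f}, integral of m_h = {mass:.6f}")
```

The rescaled integral matches the unscaled one because the quadrature nodes for $m_h$ are just the nodes for $m$ scaled by $h$, which is the discrete shadow of the change of variables $x\mapsto hx$.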

Definition: Let $\Omega\subseteq\mathbb{R}^n$ be open. Let $u:\Omega\rightarrow\mathbb{R}$ be integrable. The mollification of $u$, written as $u_h$ is given by

$u_h(x):=\int_\Omega m_h(x-y)u(y) dy$ for $x\in\Omega$, $h<\mathrm{dist}(x,\partial\Omega)$.

Note that using differentiation under the integral sign, $u_h$ is smooth. We do this by passing the derivatives under the integral sign and onto the mollifier (which is smooth). That is, for any multi-index $\alpha$,

$\displaystyle D^\alpha u_h(x)=D^\alpha_x\frac{1}{h^n}\int_\Omega m\left(\frac{x-y}{h}\right)u(y) dy = \frac{1}{h^n}\int_\Omega D^\alpha_x\left[ m\left(\frac{x-y}{h}\right)\right]u(y) dy$.

## Convergence Results

Next, we investigate the convergence of $u_h$ to $u$ as $h\rightarrow 0$. As usual, we start with the continuous case.

Theorem 2: Let $u\in C(\Omega)$. Then, $u_h\rightarrow u$ as $h\rightarrow 0$ uniformly on compact subsets of $\Omega$.

Proof: Let $K\subset\Omega$ be compact. Let $h<\mathrm{dist}(K,\partial\Omega)$ (to make sense of the following integrals). Note that using the fact that $\mathrm{supp}\, m_h=\{|x|\le h\}$, we have that

$\displaystyle u_h(x)=\int_{|x-y|\le h}m_h(x-y)u(y) dy=\int_{|y|\le 1}m(y)u(x-hy) dy$

using the change of variables $y\mapsto x-hy$. Since $u(x)=\int u(x)m(y)dy$, we have that

$\displaystyle u(x)-u_h(x)=\int_{|x-y|\le h} m_h(x-y)[u(x)-u(y)]dy$.

Thus, taking absolute values, we have that

$\displaystyle |u(x)-u_h(x)|\le \sup_{|x-y|\le h}|u(x)-u(y)|$.

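Since $K$ is compact, the closed neighborhood $K_{h_0}:=\{x:\mathrm{dist}(x,K)\le h_0\}$ is a compact subset of $\Omega$ for any fixed $h_0<\mathrm{dist}(K,\partial\Omega)$, so $u$ is uniformly continuous on $K_{h_0}$. Thus, for any $\epsilon>0$, there is a sufficiently small $h\le h_0$ so that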

$\displaystyle\sup_{x\in K}|u(x)-u_h(x)|<\epsilon$.

QED.
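
To see the estimate $|u(x)-u_h(x)|\le\sup_{|x-y|\le h}|u(x)-u(y)|$ in action, here is a small Python sketch that mollifies the continuous (but not differentiable) function $u(x)=|x|$ on the compact set $K=[-0.5,0.5]$. The discrete weights below stand in for the mollifier $m$ and are normalized to sum to one; the names `mollify` and `sup_error` are hypothetical helpers for this illustration only.

```python
import math

def bump(z: float) -> float:
    """Unnormalized 1-D bump exp(1/(z^2 - 1)) on |z| < 1, zero elsewhere."""
    return math.exp(1.0 / (z * z - 1.0)) if abs(z) < 1.0 else 0.0

# Midpoint nodes on [-1, 1] and discrete mollifier weights that sum to 1.
N = 2000
NODES = [-1.0 + (k + 0.5) * 2.0 / N for k in range(N)]
_raw = [bump(z) for z in NODES]
_total = sum(_raw)
WEIGHTS = [w / _total for w in _raw]

def mollify(u, x: float, h: float) -> float:
    """Approximate u_h(x) = int_{|z|<=1} m(z) u(x - h z) dz."""
    return sum(w * u(x - h * z) for z, w in zip(NODES, WEIGHTS))

def sup_error(u, h: float, a: float = -0.5, b: float = 0.5, samples: int = 101) -> float:
    """Largest sampled value of |u(x) - u_h(x)| for x in [a, b]."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(u(x) - mollify(u, x, h)) for x in xs)

u = abs  # u(x) = |x| is 1-Lipschitz, so sup_{|x-y|<=h} |u(x)-u(y)| <= h
errors = {h: sup_error(u, h) for h in (0.2, 0.1, 0.05)}
for h, err in errors.items():
    print(f"h = {h}: sup error on K = {err:.5f}")
```

For this particular $u$ the error is concentrated at the corner $x=0$ and shrinks linearly in $h$, always staying below the Lipschitz bound $h$.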

Now, we extend to more general functions. See this article for a refresher on $L^p$ spaces.

Theorem 3: Let $u\in L^p(\Omega)$ for some open set $\Omega\subseteq \mathbb{R}^n$ and $1\le p<\infty$. Then, $u_h\in L^p(\Omega)$, and $u_h\rightarrow u$ in $L^p(\Omega)$.

Proof: If we extend $u$ to zero outside of $\Omega$, we may assume without loss of generality that $\Omega=\mathbb{R}^n$. We will start by showing that

$|u_h|_{L^p}\le|u|_{L^p}$.

First, using the change of variables $y\mapsto x-hy$ and the fact that $m\ge 0$, with $q$ the conjugate exponent of $p$ ($1/p+1/q=1$),

$\displaystyle |u_h(x)|^p\le\left(\int_{|y|\le 1} m(y)|u(x-hy)| dy\right)^p=\left(\int_{|y|\le 1}m(y)^{1/q}m(y)^{1/p}|u(x-hy)|dy\right)^p$.

Using an application of Hölder’s inequality, this becomes

$\displaystyle |u_h(x)|^p\le\left(\int_{|y|\le 1}m(y) dy\right)^{p/q}\left(\int_{|y|\le 1}m(y)|u(x-hy)|^p dy\right)$.

Therefore, by Fubini’s theorem,

$\displaystyle \int_{\mathbb{R}^n}|u_h(x)|^p dx \le \int_{|y|\le 1} m(y)\int_{\mathbb{R}^n}|u(x-hy)|^p dx dy=|u|_{L^p}^p\int_{|y|\le 1} m(y)dy$

by the shift-invariance of the integral. Thus, we use that $\int m(y) dy=1$ and take $p$-th roots to finish the proof that $|u_h|_{L^p}\le|u|_{L^p}$, and in particular $u_h\in L^p$.

Let $\epsilon>0$. Using the density of $C_0(\mathbb{R}^n)$ in $L^p(\mathbb{R}^n)$, we can find a $\phi\in C_0(\mathbb{R}^n)$ so that $|\phi-u|_{L^p}\le \epsilon/3$. This result can be found in any measure theory book, for instance Donald Cohn’s Measure Theory. So, by the results above, we get that

$|u_h-u|_{L^p}\le|u_h-\phi_h|_{L^p}+|\phi_h-\phi|_{L^p}+|\phi-u|_{L^p}\le 2|u-\phi|_{L^p}+|\phi_h-\phi|_{L^p}$

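since $u_h-\phi_h=(u-\phi)_h$ by linearity of the integral, as the reader can easily check. Since $\phi$ has compact support, $\phi_h$ and $\phi$ vanish outside a fixed compact set for all $h\le 1$, so the uniform convergence from Theorem 2 gives $|\phi_h-\phi|_{L^p}\rightarrow0$ as $h\rightarrow 0$. So, for small enough $h$, we can make the above terms less than $\epsilon$.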

QED.
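
The two claims of Theorem 3 can be observed numerically on a discontinuous function. The sketch below is an illustration under stated assumptions, not a proof: it takes $u$ to be the indicator of $[0,1)$, works in one dimension with $p=2$, and replaces all integrals by midpoint sums. The helper names (`step`, `lp_norm`, `mollified_values`) are invented for this example.

```python
import math

def bump(z: float) -> float:
    """Unnormalized 1-D bump exp(1/(z^2 - 1)) on |z| < 1, zero elsewhere."""
    return math.exp(1.0 / (z * z - 1.0)) if abs(z) < 1.0 else 0.0

# Discrete mollifier: midpoint nodes on [-1, 1], weights normalized to sum to 1.
N_Z = 400
NODES = [-1.0 + (k + 0.5) * 2.0 / N_Z for k in range(N_Z)]
_raw = [bump(z) for z in NODES]
_total = sum(_raw)
WEIGHTS = [w / _total for w in _raw]

def step(x: float) -> float:
    """Indicator function of [0, 1): in every L^p, but not continuous."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

# Midpoint grid on [-1, 2], wide enough to contain supp(u_h) for h <= 0.2.
DX = 0.005
XS = [-1.0 + (i + 0.5) * DX for i in range(600)]

def lp_norm(values, p: float) -> float:
    """Midpoint-rule approximation of the L^p norm on the grid XS."""
    return (DX * sum(abs(v) ** p for v in values)) ** (1.0 / p)

def mollified_values(u, h: float):
    """Values of u_h(x) = int m(z) u(x - h z) dz at every grid point."""
    return [sum(w * u(x - h * z) for z, w in zip(NODES, WEIGHTS)) for x in XS]

P = 2
u_vals = [step(x) for x in XS]
norm_u = lp_norm(u_vals, P)
results = {}
for h in (0.2, 0.1, 0.05):
    uh_vals = mollified_values(step, h)
    diff = [a - b for a, b in zip(uh_vals, u_vals)]
    results[h] = (lp_norm(uh_vals, P), lp_norm(diff, P))
    print(f"h = {h}: |u_h| = {results[h][0]:.4f}, |u_h - u| = {results[h][1]:.4f}")
```

Note that $u_h$ does not converge to $u$ uniformly here (the jump survives in the sup norm), yet the $L^2$ distance still shrinks as $h$ decreases, which is exactly the gap between Theorem 2 and Theorem 3.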

## The Product Rule

We are almost ready to prove the product rule for weak derivatives. We just need to know how mollification and weak derivatives interact.

Theorem 4: Let $u:\Omega\rightarrow\mathbb{R}$ have weak derivative $D^\alpha u$ for some multi-index $\alpha$. Then,

$D^\alpha u_h(x)=(D^\alpha u)_h(x)$.

Note that the left-hand side of the preceding equation is the derivative of the smooth function $u_h$ in the classical sense, and the right-hand side is the mollification of the weak derivative $D^\alpha u$. In the below proof, we will specify that $D^\alpha_x$ is the (weak) derivative in the $x$-variable, and $D^\alpha_y$ is the (weak) derivative in the $y$-variable when such ambiguities must be dealt with.

Proof: Using differentiation under the integral sign, we have that

$\displaystyle D^\alpha u_h(x) =\int_\Omega \left(D^\alpha_x m_h(x-y)\right)u(y) dy=(-1)^{|\alpha|}\int_\Omega \left( D^\alpha_y m_h(x-y)\right) u(y) dy$

by the chain rule in the classical sense. Note that, for $h<\mathrm{dist}(x,\partial\Omega)$, the function $y\mapsto m_h(x-y)$ belongs to $C^\infty_0(\Omega)$, so it is an admissible test function. So, using the weak derivative property of $u$, we get that

$\displaystyle D^\alpha u_h(x)=\int_\Omega m_h(x-y)D^\alpha_y u(y) dy=(D^\alpha u)_h(x)$.

QED.
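
Here is a numerical check of Theorem 4 in the simplest interesting case: $n=1$, $u(x)=|x|$, whose weak derivative is $\mathrm{sign}(x)$. The sketch is only an illustration. It evaluates the left-hand side by putting the derivative on the mollifier, as in the first line of the proof, and the right-hand side by mollifying $\mathrm{sign}$. After the substitution $y=x-hz$ these become $\frac{1}{h}\int m'(z)|x-hz|dz$ and $\int m(z)\,\mathrm{sign}(x-hz)dz$. All function names are hypothetical.

```python
import math

def bump(z: float) -> float:
    """Unnormalized 1-D bump exp(1/(z^2 - 1)) on |z| < 1, zero elsewhere."""
    return math.exp(1.0 / (z * z - 1.0)) if abs(z) < 1.0 else 0.0

def bump_prime(z: float) -> float:
    """Classical derivative of the bump: bump(z) * (-2 z / (z^2 - 1)^2)."""
    if abs(z) >= 1.0:
        return 0.0
    return bump(z) * (-2.0 * z / (z * z - 1.0) ** 2)

# Midpoint nodes on [-1, 1] and the normalizing constant of the mollifier.
N = 20000
DZ = 2.0 / N
NODES = [-1.0 + (k + 0.5) * DZ for k in range(N)]
C = 1.0 / (DZ * sum(bump(z) for z in NODES))

def sign(y: float) -> int:
    """Weak derivative of |y| (its value at y = 0 is irrelevant)."""
    return (y > 0) - (y < 0)

def derivative_of_mollification(x: float, h: float) -> float:
    """D u_h(x) = (1/h) * int m'(z) |x - h z| dz for u(y) = |y|."""
    return C * DZ * sum(bump_prime(z) * abs(x - h * z) for z in NODES) / h

def mollification_of_weak_derivative(x: float, h: float) -> float:
    """(u')_h(x) = int m(z) sign(x - h z) dz."""
    return C * DZ * sum(bump(z) * sign(x - h * z) for z in NODES)

h = 0.1
for x in (0.0, 0.03, 0.2):
    lhs = derivative_of_mollification(x, h)
    rhs = mollification_of_weak_derivative(x, h)
    print(f"x = {x}: D(u_h) = {lhs:.5f}, (Du)_h = {rhs:.5f}")
```

At $x=0.2$ the ball of radius $h$ misses the corner entirely, so both sides reduce to the classical derivative $1$; at $x=0$ both vanish by symmetry.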

Now, we have all the machinery necessary to prove our initial theorem for the product rule. Note that we have to make the assumption that the integral of the product makes sense.

Theorem 5: Let $u,v:\Omega\rightarrow\mathbb{R}$ be weakly differentiable. Then, if the product $uv$ is integrable, it is weakly differentiable with

$(uv)'=u'v+uv'$.

Proof: Replacing $v$ with $v_h$, the mollified version of $v$, we can use Theorem 1 to get that

$(uv_h)'=u'v_h+u(v_h)'$.

By Theorem 4, $(v_h)'=(v')_h$. We can then say that the right-hand side converges to $u'v+uv'$ using Theorem 3. Also, we can show that $uv_h\rightarrow uv$ in $L^1$ using the fact that $v_h\rightarrow v$ in $L^1$ implies that there is some subsequence $v_n\rightarrow v$ almost everywhere (see any introductory book on measure theory). This, and the dominated convergence theorem give the result.

QED.
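
To make Theorem 5 concrete, the following sketch tests the defining identity $\int_\Omega uv\phi' dx=-\int_\Omega(u'v+uv')\phi dx$ numerically for two non-smooth factors. The choices $\Omega=(-1,1)$, $u(x)=|x|$, $v(x)=|x-0.3|$ and the bump test function $\phi$ are assumptions made purely for illustration.

```python
import math

def phi(x: float) -> float:
    """Test function in C_0^infty((-1, 1)): the unnormalized bump."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def phi_prime(x: float) -> float:
    """Classical derivative of phi."""
    if abs(x) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * x / (x * x - 1.0) ** 2)

def sign(y: float) -> int:
    return (y > 0) - (y < 0)

def u(x: float) -> float:
    return abs(x)

def du(x: float) -> int:
    return sign(x)          # weak derivative of |x|

def v(x: float) -> float:
    return abs(x - 0.3)

def dv(x: float) -> int:
    return sign(x - 0.3)    # weak derivative of |x - 0.3|

def midpoint(f, a: float, b: float, n: int = 20000) -> float:
    """Composite midpoint rule for the integral of f over [a, b]."""
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

# Left side: integral of u v phi'.  Right side: minus integral of (u'v + uv') phi.
lhs = midpoint(lambda x: u(x) * v(x) * phi_prime(x), -1.0, 1.0)
rhs = -midpoint(lambda x: (du(x) * v(x) + u(x) * dv(x)) * phi(x), -1.0, 1.0)
print(f"int u v phi' = {lhs:.8f}")
print(f"-int (u'v + uv') phi = {rhs:.8f}")
```

Neither factor is smooth, so Theorem 1 does not apply directly, but the two integrals still agree to quadrature accuracy.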

## Introduction

A few natural questions arise when we first encounter the weak derivative. Does the product rule hold? How about the chain rule? In order to answer these questions, we will need some more analytical machinery. The first topic is the concept of “differentiating under the integral sign.” The question is this: suppose $f:\mathbb{R}^2\rightarrow\mathbb{R}$ is given. When does

$\displaystyle\frac{d}{dx}\int_a^b f(x,t)dt=\int_a^b\frac{\partial}{\partial x}f(x,t) dt$?

When can we interchange the operations of integrating and differentiating? Let’s start, as we usually do, with the easy case (just like in elementary calculus, where we seemed to interchange these operations like “magic”).

## The Continuously Differentiable Version

Theorem 1: Let $f\in C^1(\mathbb{R}^{n+1})$. That is, $f$ has continuous partial derivatives. Let $a<b$. Then, for any $i\in\{1,\dots,n\}$,

$\displaystyle\frac{\partial}{\partial x_i}\int_a^b f(x,t)dt = \int_a^b\frac{\partial}{\partial x_i} f(x,t) dt$.

Proof: Without loss of generality (WLOG), let $i=1$. The other possibilities for $i$ are similar. Fix $\hat{x}:=(x_2,\dots,x_n)$ and define the function $F:\mathbb{R}\rightarrow \mathbb{R}$ by $F(\xi):=\int_a^b f(\xi,\hat{x},t) dt$. Then, by the fundamental theorem of calculus, since partial derivatives are continuous,

$\displaystyle\int_a^b\int_0^{x_1}\frac{\partial}{\partial y} f(y,\hat{x},t)dydt=\int_a^b f(x_1,\hat{x},t)-f(0,\hat{x},t) dt=F(x_1)-F(0)$.

Switching the order of integration on the first integral, and differentiating, we again use the fundamental theorem of calculus (Lebesgue Differentiation!) to say that the left hand side of above becomes

$\displaystyle\frac{\partial}{\partial x_1}\int_0^{x_1} \int_a^b \frac{\partial}{\partial y} f(y,\hat{x},t)dtdy=\int_a^b\frac{\partial}{\partial x_1}f(x_1,\hat{x},t)dt$.

The right hand side gives us simply

$\displaystyle\frac{\partial}{\partial x_1} F(x_1)=\frac{\partial}{\partial x_1}\int_a^b f(x_1,\hat{x},t)dt$,

and we are done.

QED.
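
A quick numerical illustration of Theorem 1, under the assumption $f(x,t)=\sin(xt)$ with $a=0$, $b=1$ (an example chosen here, not taken from the text): the left side is approximated by a central difference quotient of $F(x)=\int_0^1\sin(xt)dt$, and the right side by integrating $\partial_x f=t\cos(xt)$ directly. For $x=1$ both should equal $\sin 1+\cos 1-1$.

```python
import math

def midpoint(f, a: float, b: float, n: int = 2000) -> float:
    """Composite midpoint rule for the integral of f over [a, b]."""
    dt = (b - a) / n
    return dt * sum(f(a + (k + 0.5) * dt) for k in range(n))

def F(x: float) -> float:
    """F(x) = int_0^1 sin(x t) dt."""
    return midpoint(lambda t: math.sin(x * t), 0.0, 1.0)

def integral_of_partial(x: float) -> float:
    """int_0^1 (d/dx) sin(x t) dt = int_0^1 t cos(x t) dt."""
    return midpoint(lambda t: t * math.cos(x * t), 0.0, 1.0)

x0 = 1.0
delta = 1e-5
lhs = (F(x0 + delta) - F(x0 - delta)) / (2.0 * delta)  # derivative of the integral
rhs = integral_of_partial(x0)                          # integral of the derivative
exact = math.sin(1.0) + math.cos(1.0) - 1.0
print(f"d/dx of integral = {lhs:.8f}")
print(f"integral of d/dx = {rhs:.8f}")
print(f"closed form      = {exact:.8f}")
```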

## The Measure Theoretic Version

Next, we take the usual trip into measure theory. The following theorem is taken mostly from Folland’s Real Analysis: Modern Techniques and their Applications.

Theorem 2: Let $(X,\mu)$ be a measure space. Let $f:X\times\Omega \rightarrow \mathbb{R}$, for some open subset $\Omega \subseteq \mathbb{R}^n$, be an integrable function of $x\in X$ for each $t\in\Omega$. Moreover, suppose that for each $x\in X$ and each $i\in\{1,\dots,n\}$, $\frac{\partial}{\partial t_i} f(x,t)$ exists and there is some integrable $g$ with $\left|\frac{\partial}{\partial t_i}f(x,t)\right|\le g(x)$ for each $t\in\Omega$. Then, for each $t\in\Omega$ and each $i\in\{1,\dots,n\}$,

$\displaystyle\frac{\partial}{\partial t_i}\int_X f(x,t_1,\dots,t_n)d\mu=\int_X \frac{\partial}{\partial t_i}f(x,t_1,\dots,t_n)d\mu$.

Proof: For notational simplicity, we will assume that $\Omega\subseteq\mathbb{R}$. We will leave it to the reader to extrapolate to the above case. This theorem is a classic exercise in the use of dominated convergence. We just need to set everything up the right way. We want that

$\displaystyle\frac{d}{ dt}\int_X f(x,t)d\mu=\lim_{h\rightarrow0}\frac{1}{h}\int_X f(x,t+h)-f(x,t)d\mu$

and then to pass the limit underneath the integral. This only requires the existence of a dominating integrable function. By the mean value theorem (or multi-dimensional mean-value theorem if we are using $t\in\Omega\subseteq\mathbb{R}^n$), there is some $c\in(t,t+h)$ so that

$\displaystyle f'(x,c)=\frac{f(x,t+h)-f(x,t)}{h}$.

Thus, we have that

$\displaystyle\left|\frac{f(x,t+h)-f(x,t)}{h}\right|\le\sup_{c\in[t,t+h]}|f'(x,c)|\le g(x)$,

an integrable function. Thus, we can pass the limit under the integral sign using Lebesgue’s dominated convergence theorem, and we are done.

QED.
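
As a hedged illustration of Theorem 2 on an unbounded domain, take $X=\mathbb{R}$ with Lebesgue measure and $f(x,t)=e^{-x^2}\cos(tx)$ (an example picked for this sketch). Then $|\partial_t f(x,t)|=|x|e^{-x^2}|\sin(tx)|\le |x|e^{-x^2}=:g(x)$, which is integrable, so the theorem applies. The standard Gaussian integral $\int_{\mathbb{R}}e^{-x^2}\cos(tx)dx=\sqrt{\pi}e^{-t^2/4}$ gives a closed form to compare against. The code truncates $\mathbb{R}$ to $[-8,8]$, where the neglected tails are far below the quadrature tolerance.

```python
import math

def midpoint(f, a: float, b: float, n: int = 16000) -> float:
    """Composite midpoint rule for the integral of f over [a, b]."""
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

L = 8.0  # truncation of the real line; exp(-64) is negligible

def I(t: float) -> float:
    """I(t) = int_R exp(-x^2) cos(t x) dx, truncated to [-L, L]."""
    return midpoint(lambda x: math.exp(-x * x) * math.cos(t * x), -L, L)

def integral_of_partial(t: float) -> float:
    """int_R (d/dt)[exp(-x^2) cos(t x)] dx = int_R -x exp(-x^2) sin(t x) dx."""
    return midpoint(lambda x: -x * math.exp(-x * x) * math.sin(t * x), -L, L)

t0 = 1.0
delta = 1e-5
difference_quotient = (I(t0 + delta) - I(t0 - delta)) / (2.0 * delta)
under_the_integral = integral_of_partial(t0)
# Derivative of sqrt(pi) * exp(-t^2 / 4) with respect to t.
closed_form = -0.5 * t0 * math.sqrt(math.pi) * math.exp(-t0 * t0 / 4.0)
print(difference_quotient, under_the_integral, closed_form)
```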

## Further Questions

Though I do not delve into the subject any further at this time, one could ask how far exactly we can weaken these assumptions. What if we wanted to talk about interchanging the weak derivative and the integral? When is that possible? This post gives a good flavor of how far this subject goes. Basically, to weaken our assumptions any further, we would need to introduce distribution theory, and I have not yet decided whether or not I’ll get into it. Another time, perhaps.

## Introduction

The basis of modern PDE theory is the idea of the weak or distributional derivative. Since measure theory ignores a function’s values on a set of measure zero, why can’t we ignore some of the more “problematic points” from classical differentiability theory? For instance, take the function $f:[0,2]\rightarrow\mathbb{R}$ defined by

$\displaystyle f(x):=\left\{\begin{array}{ll} x & x\in[0,1) \\ 1 & x\in[1,2].\end{array}\right.$

This has the problem of a “cusp” or “elbow” where we are unable to define a tangent line as the following picture shows.

If we were able to “ignore” the point at $x=1$, then we would like to say that in some sense

$\displaystyle f'(x)=\left\{\begin{array}{ll}1&x\in[0,1) \\ 0&x\in[1,2].\end{array}\right.$

This is where the weak derivative (or distributional derivative) comes into play.

## The Weak Derivative

### The One-Dimensional Case

Let’s keep building on our intuition into this subject. Let $f,g:[0,1] \rightarrow \mathbb{R}$ be two differentiable functions. Let $g(0)=g(1)=0$. Then, elementary calculus (integration by parts) tells us that

$\displaystyle\int_0^1 f(x)g'(x)dx=-\int_0^1f'(x)g(x)dx$

since the boundary term of $f(x)g(x)$ disappears. Even if $f$ is not differentiable we might be able to make sense of the above formula. We are now ready to state our first preliminary definition of the weak derivative.

Definition: Let $f:[0,1]\rightarrow\mathbb{R}$ be any real-valued function. We will say that $h:[0,1]\rightarrow\mathbb{R}$ is the weak derivative of $f$ if for every differentiable function $g:[0,1]\rightarrow\mathbb{R}$ with $g(0)=g(1)=0$, we have that

$\displaystyle\int_0^1f(x)g'(x)dx=-\int_0^1h(x)g(x)dx$.

The higher-dimensional version requires a little more work. First, we let $\Omega$ be some domain in $\mathbb{R}^n$ (an open and connected region). For simplicity, you can think of $\Omega=\mathbb{R}^n$ or $B_n$, the open unit ball. Let $C^\infty_0(\Omega)$ be the space of infinitely differentiable functions on $\Omega$ with compact support. That is, for $\phi\in C^\infty_0(\Omega)$, the support of $\phi$ defined by $\mathrm{supp}(\phi):=\overline{\{x\in\Omega:\phi(x)\ne0\}}$ is compact in $\Omega$. We will refer to this space as our space of “test functions.” Let $\alpha:=(\alpha_1, \dots, \alpha_n) \in \mathbb{Z}^n_{\ge0}$ be a multi-index. Then, define the differential operator $D^\alpha$, acting on $\phi\in C^\infty(\mathbb{R}^n)$, by

$\displaystyle D^\alpha:=\frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}}\cdots\frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}}$.

We are now ready to define the $n$-dimensional weak derivative.

Definition: Let $f:\Omega\rightarrow\mathbb{R}$ be given. Then, we say that $g:\Omega\rightarrow\mathbb{R}$ is the $\alpha$-weak derivative of $f$ for some multi-index $\alpha$, if for each $\phi\in C^\infty_0(\Omega)$, the following integration by parts formula holds:

$\displaystyle\int_\Omega f(x)D^\alpha\phi(x)dx=(-1)^{|\alpha|}\int_\Omega g(x)\phi(x)dx$

where $|\alpha|=|\alpha_1|+\cdots+|\alpha_n|$.

### Example 1

As an example, consider the above function

$\displaystyle f(x):=\left\{\begin{array}{ll}x&x\in[0,1) \\ 1&x\in[1,2].\end{array}\right.$

Then, for any function $\phi:[0,2]\rightarrow\mathbb{R}$ differentiable with $\phi(0)=\phi(2)=0$, we have that

$\displaystyle -\int_0^2 f(x)\phi'(x)dx=-\int_0^1 x\phi'(x)dx-\int_1^2\phi'(x)dx$.

Working with the first term in the right-hand side, we use integration by parts to get

$\displaystyle -\int_0^1 x\phi'(x)dx = -x\phi(x)|_0^1+\int_0^1 \phi(x)dx = -\phi(1)+\int_0^1\phi(x)dx$.

The fundamental theorem of calculus plus the assumption that $\phi(2)=0$ on the second term on the right-hand side gives

$\displaystyle -\int_1^2\phi'(x)dx=-\phi(2)+\phi(1)=\phi(1)$.

Putting this all together, we have that

$\displaystyle -\int_0^2 f(x)\phi'(x)dx=\int_0^1\phi(x)dx=\int_0^2 g(x)\phi(x)dx$

where $g$ is given by

$\displaystyle g(x):=\left\{\begin{array}{ll}1&x\in[0,1) \\ 0&x\in[1,2].\end{array}\right.$

Note first of all that $g$ is only defined up to a set of measure zero (if we are thinking of our integrals as Lebesgue integrals) or up to a discrete set of points (if we are thinking of our integrals as Riemann integrals). Also, notice that $g$ is not even continuous: it has a jump at $x=1$, which a classical derivative defined on all of $[0,2]$ could never have, since classical derivatives satisfy the intermediate value property (Darboux's theorem). This is a feature of the weak formulation and not a true counterexample.
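
The computation in Example 1 can be checked numerically for one concrete choice of $\phi$. The sketch below assumes $\phi(x)=x(2-x)$, which is differentiable with $\phi(0)=\phi(2)=0$; any other admissible $\phi$ would do. By hand, both sides equal $2/3$ for this choice.

```python
def f(x: float) -> float:
    """The function with a corner at x = 1."""
    return x if x < 1.0 else 1.0

def g(x: float) -> float:
    """Candidate weak derivative of f."""
    return 1.0 if x < 1.0 else 0.0

def phi(x: float) -> float:
    """An admissible test function on [0, 2]: phi(0) = phi(2) = 0."""
    return x * (2.0 - x)

def phi_prime(x: float) -> float:
    return 2.0 - 2.0 * x

def midpoint(func, a: float, b: float, n: int = 20000) -> float:
    """Composite midpoint rule for the integral of func over [a, b]."""
    dx = (b - a) / n
    return dx * sum(func(a + (k + 0.5) * dx) for k in range(n))

lhs = -midpoint(lambda x: f(x) * phi_prime(x), 0.0, 2.0)  # -int f phi'
rhs = midpoint(lambda x: g(x) * phi(x), 0.0, 2.0)         #  int g phi
print(f"-int f phi' = {lhs:.8f}, int g phi = {rhs:.8f}")
```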

### Example 2

Corners don’t seem to be a problem, but what about jumps? So, let’s consider the function

$\displaystyle f(x):=\left\{\begin{array}{ll}0&x\in[0,1) \\ 1&x\in[1,2].\end{array}\right.$

This seems like it should have a weak derivative of zero. So, let $\phi$ be a test function. Then, by the fundamental theorem of calculus and the fact that $\phi(2)=0$,

$\displaystyle -\int_0^2 f(x)\phi'(x) dx= -\int_1^2 \phi'(x)dx=-\phi(2)+\phi(1)=\phi(1)$.

On the other hand, we want this to be equal to

$\displaystyle\int_0^2 g(x)\phi(x) dx$

for some $g$ and any test function $\phi$. If such a $g$ existed, then

$\displaystyle\int_0^2 g(x)\phi(x) dx=\phi(1)$

for any test function $\phi$. Picking test functions with $\phi(1)=0$ would give us that $g=0$ almost everywhere on the interval $[0,2]$ (this is a non-trivial result most readily seen through the lens of functional analysis). On the other hand, we can certainly find test functions with $\phi(1)=k$ for any value $k\in\mathbb{R}$ (also non-trivial). So, no such $g$ exists and $f$ is not weakly differentiable.

Note: The above $f$ does have a derivative if we expand our notion of derivative even further. This is the notion of the distributional derivative, in which case $f'$ is given as a delta function. However, these subjects will have to wait for now…


## Introduction: The Continuous Case

As we briefly stated in my previous post, there is a particularly powerful measure theoretic tool called Lebesgue differentiation. It is a generalization of the fundamental theorem of calculus part one. Here it is, stated in the most general form for which we will prove it.

Theorem 1: Let $f:\mathbb{R}^n\rightarrow\mathbb{R}$ be integrable (denoted by $f\in L^1$). Then, for almost every $x\in\mathbb{R}^n$,

$\displaystyle\lim_{r\rightarrow0}\frac{1}{|B_r(x)|}\int_{B_r(x)} f(t) dt = f(x)$

where $B_r(x):=\{y\in\mathbb{R}^n:|x-y|<r\}$ is the ball centered at $x$ of radius $r>0$, and $|B_r(x)|$ is the measure (or volume) of the ball.

For the rest of this post, we will use the notation $f\in L^1$ for $f$ being integrable, and $|A|$ to denote the Lebesgue measure or volume of the set $A$.

One might wonder, what does that have to do with differentiation? One particular corollary (whose proof is left as an exercise for the reader) is the following:

Corollary 1: Let $f:\mathbb{R}\rightarrow\mathbb{R}$ be in $L^1$, and let $F$ be given by

$F(x):=\int_a^x f(t) dt$.

Then, for almost every $x\in\mathbb{R}$, we have that the derivative of $F$ at $x$ exists, and

$F'(x)=f(x)$.

We start with the continuous case which has a fairly elementary proof.

Theorem 2: Let $f:\mathbb{R}^n\rightarrow\mathbb{R}$ be continuous. Then, for every $x\in\mathbb{R}^n$,

$\displaystyle\lim_{r\rightarrow0}\frac{1}{|B_r(x)|}\int_{B_r(x)} f(t) dt = f(x)$.

Proof: Let $\epsilon>0$. Since $f$ is continuous, there exists an $r>0$ so that for $t\in B_r(x)$, $|f(t)-f(x)|<\epsilon$. Then, since

$\displaystyle\frac{1}{|B_r(x)|}\int_{B_r(x)}f(x) dt = f(x)$

(noting the inside of the integral is the fixed value $f(x)$), we have that

$\displaystyle\begin{array}{ll}\left|\frac{1}{|B_r(x)|}\int_{B_r(x)} f(t) dt - f(x)\right| &= \left|\frac{1}{|B_r(x)|}\int_{B_r(x)} f(t)-f(x) dt\right| \\ &\le \frac{1}{|B_r(x)|}\int_{B_r(x)}|f(t)-f(x)| dt \\ &<\frac{1}{|B_r(x)|}\int_{B_r(x)}\epsilon dt \\ &= \epsilon\end{array}$.

QED.
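
In one dimension the ball $B_r(x)$ is the interval $(x-r,x+r)$ and $|B_r(x)|=2r$, so the statement is easy to watch numerically. The sketch below assumes $f=\cos$ and $x=1$, choices made only for illustration, and prints the gap between the average and $f(x)$ for shrinking radii.

```python
import math

def ball_average(f, x: float, r: float, n: int = 1000) -> float:
    """Average of f over B_r(x) = (x - r, x + r) via the midpoint rule."""
    dt = 2.0 * r / n
    total = dt * sum(f(x - r + (k + 0.5) * dt) for k in range(n))
    return total / (2.0 * r)

x0 = 1.0
radii = (0.5, 0.1, 0.02)
gaps = [abs(ball_average(math.cos, x0, r) - math.cos(x0)) for r in radii]
for r, gap in zip(radii, gaps):
    print(f"r = {r}: |average - f(x)| = {gap:.2e}")
```

For this smooth $f$ the gap shrinks roughly like $r^2$; the theorem itself only promises convergence, with no rate.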

## Maximal Functions

Though many proofs exist of this result in great generality, here, we will present one based on some elementary harmonic analysis. The credit for this proof goes to Stein’s book Singular Integrals and Differentiability Properties of Functions. We start by introducing the concept of the maximal function.

Definition: Let $f:\mathbb{R}^n\rightarrow\mathbb{R}$ be a function. Then, the maximal function of $f$ is

$\displaystyle Mf(x):=\sup_{r>0}\frac{1}{|B_r(x)|}\int_{B_r(x)}|f(t)| dt$.

Note the similarities to the statement of the Lebesgue differentiation theorem. This definition more or less defines a “worst-case scenario” for the local integrability of the function $f$. A more rigorous treatment of the concept reveals that it is actually a fairly nice operator acting on $L^p$ spaces, but for our purposes, we only need the following property.

Lemma 1:  Let $f\in L^1$. Then, for every $\epsilon>0$

$|\{x:Mf(x)>\epsilon\}|\le\frac{C}{\epsilon}||f||_1$

where

$\displaystyle||f||_1:=\int_{\mathbb{R}^n}|f(x)| dx$

denotes the $L^1$ norm of $f$ (or just skip the notation and think of it as the integral of the absolute value of $f$) and $C$ is some positive constant depending only on $n$, the dimension of the space ($C=5^n$ will work).

Although you are encouraged to read the proof that follows, it is highly technical and requires some fairly complicated measure-theoretic techniques (and a fairly technical lemma which is left unproven). The casual reader is then encouraged to just take the above lemma to heart and skip to the final section. In the language of $L^p$ spaces, what this lemma is saying, is that the maximal function is of type weak $L^1$. Perhaps, in a later post, we will revisit these concepts…

To prove this, we will need the following lemma of Vitali, specifically, a version from Stein’s book Singular Integrals and Differentiability Properties of Functions. Unfortunately, we will not give a proof of this lemma, though the interested reader will find the proof in the book stated, and related proofs in many introductory measure theory texts, for example Cohn’s Measure Theory.

Lemma 2: Let $E$ be a measurable subset of $\mathbb{R}^n$ which is covered by the union of a family of balls $\{B_i\}_{i\in I}$ of bounded diameter for some indexing set $I$. Then, from this family, we can select a disjoint subsequence $B_1,B_2,\dots$ either finite or countably infinite so that

$\displaystyle\sum_k |B_k|\ge C^{-1} |E|$

where $C$ is some positive constant ($5^n$ will work).

Proof of Lemma 1: Taking for granted that $Mf$ is measurable (the supremum of measurable functions is measurable), we have that the set

$E_\epsilon :=\{x:Mf(x)>\epsilon\}$

is Lebesgue measurable. So, by the definition of the maximal function, for each $x\in E_\epsilon$, there is a ball centered at $x$, denoted by $B_x$ so that

$\displaystyle||f||_1\ge\int_{B_x}|f(t)| dt >\epsilon|B_x|$.

Thus, we have that $|B_x|<(1/\epsilon)||f||_1$ giving us that the balls $\{B_x\}_{x\in E_\epsilon}$ have bounded diameters. Thus, by Lemma 2, we extract a sequence of balls $\{B_k\}$ which are mutually disjoint and satisfy

$\displaystyle\sum_{k} |B_k|\ge C^{-1}|E_\epsilon|$.

Putting the above concepts together, we have that

$\displaystyle||f||_1\ge\int_{\cup B_k} |f(t)| dt \ge\epsilon\sum_{k} |B_k|\ge\epsilon C^{-1}|E_\epsilon|$.

Rearranging the above inequality gives the desired result

$\displaystyle|\{x:Mf(x)>\epsilon\}|\le\frac{C}{\epsilon}||f||_1$.

QED.
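
The weak-type bound can be checked by brute force in a simple case. The sketch below assumes $n=1$ and $f$ the indicator of $[0,1]$, so that $||f||_1=1$ and the average of $|f|$ over $(x-r,x+r)$ is just the overlap length divided by $2r$. The supremum over $r$ is replaced by a maximum over a finite grid of radii, so the numbers are approximations. For this $f$ one can work out by hand that $Mf(x)=1/(2x)$ for $x>1$ and $Mf(x)=1/(2(1-x))$ for $x<0$, so the level set for $\epsilon=0.25$ is $(-1,2)$, of measure $3$, comfortably below $C/\epsilon=5/0.25=20$.

```python
def average(x: float, r: float) -> float:
    """Average of |f| over (x - r, x + r) for f = indicator of [0, 1]."""
    overlap = max(0.0, min(x + r, 1.0) - max(x - r, 0.0))
    return overlap / (2.0 * r)

# Finite grid of radii standing in for the supremum over all r > 0.
RADII = [0.01 * k for k in range(1, 1001)]

def maximal(x: float) -> float:
    """Grid approximation of the maximal function Mf(x)."""
    return max(average(x, r) for r in RADII)

def level_set_measure(eps: float, a: float = -4.0, b: float = 6.0, dx: float = 0.02) -> float:
    """Approximate Lebesgue measure of {x in [a, b] : Mf(x) > eps}."""
    n = round((b - a) / dx)
    xs = [a + (k + 0.5) * dx for k in range(n)]
    return dx * sum(1 for x in xs if maximal(x) > eps)

eps = 0.25
norm_f = 1.0                 # L^1 norm of the indicator of [0, 1]
measure = level_set_measure(eps)
bound = 5.0 / eps * norm_f   # C = 5^n with n = 1
print(f"|{{Mf > {eps}}}| is about {measure:.2f}, weak-type bound = {bound:.1f}")
```

Note that $Mf$ itself is not integrable here (it decays like $1/(2|x|)$), which is why only a weak-type estimate is available in $L^1$.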

## Proof of Lebesgue Differentiation

Now, we are prepared to prove the powerful Lebesgue differentiation theorem.

Proof of Lebesgue differentiation: Denote by $f_r$ for some $r>0$ the function

$\displaystyle f_r(x):=\frac{1}{|B_r(x)|}\int_{B_r(x)} f(t) dt$.

Then, we can restate the theorem as $f_r\rightarrow f$ almost everywhere as $r\rightarrow 0$. To this end, we introduce the following error function $Ef$.

$\displaystyle Ef(x):=|\limsup_{r\rightarrow0} f_r(x)-\liminf_{r\rightarrow0} f_r(x)|$.

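Then, the points where $\lim_{r\rightarrow0} f_r(x)$ fails to exist are precisely where $Ef(x)>0$. Once we know that the limit exists almost everywhere, it must equal $f$ almost everywhere, because $f_r\rightarrow f$ in $L^1$ as $r\rightarrow0$ (each $f_r$ is an average of translates of $f$), so a subsequence of $f_r$ converges to $f$ almost everywhere. To this end, let $\epsilon>0$. We need to show that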

$|\{x:Ef(x)>\epsilon\}|=0$

Note that by theorem 2, for $g$ continuous, $Eg\equiv 0$. By the density of the continuous functions in $L^1$, we have that $f=h+g$ where $g$ is continuous and $||h||_1$ is as small as we like. A quick inspection of the operator $E$ reveals that

$Ef(x)=E(g+h)(x)\le Eg(x)+Eh(x)=Eh(x)$

for each $x$. Moreover, $Ef(x)\le Eh(x)\le 2Mh(x)$. So,

$\displaystyle\{x:Ef(x)>\epsilon\}\subset\{x:Mh(x)>\frac{\epsilon}{2}\}$.

Therefore, we have by the measure-theoretic property that $|A|\le |B|$ if $A\subset B$ and lemma 1,

$\displaystyle|\{x:Ef(x)>\epsilon\}|\le|\{x:Mh(x)>\frac{\epsilon}{2}\}|\le\frac{2C}{\epsilon}||h||_1$.

Letting $||h||_1\rightarrow 0$ gives the required result.

QED.


## Introduction and Notation

In analysis, we rarely deal with equalities. Often times you have to “settle for” an inequality. Take, for example, the study of the Navier-Stokes equations

$\displaystyle\left\{\begin{array}{l}u_t-\nu\Delta u+(u\cdot\nabla)u+\nabla p=f(x,t) \\ \nabla \cdot u=0 \end{array}\right.$

In certain cases, with some smoothing operations applied (I will give the details at a later time), we can obtain the following differential inequality:

$\displaystyle\frac{d}{dt}|u|^2+\nu\lambda|u|^2\le\frac{||f||^2_*}{\nu}$

which we will make more rigorous sense of at a later time. To analyze the above system, we will use the famous Gronwall inequality, to appear later. Using this, we are able to bound the growth of the norms of the vector-valued function $u$ to obtain the existence (in some sense) and structure of the solutions to the famous equation.

Now, we define the objects in question for today. Suppose $x:[0,T] \rightarrow \mathbb{R}$ is differentiable. Denote by

$\displaystyle\frac{d}{dt} x=\lim_{h\rightarrow0}\frac{x(t+h)-x(t)}{h}$,

the classical derivative of $x$ at the point $t$.

## Gronwall’s Inequality: First Version

The classical Gronwall inequality is the following theorem.

Theorem 1: Let $x$ be as above. Suppose $x$ satisfies the following differential inequality

$\displaystyle\frac{d}{dt}x(t)\le g(t)x(t)+h(t)$

for $g$ continuous and $h$ locally integrable. Then, we have that

$\displaystyle x(t)\le x(0)e^{G(t)}+\int_0^te^{G(t)-G(s)}h(s) ds$,

for

$\displaystyle G(t):=\int_0^t g(r) dr$.

Proof: This is an exercise in ordinary differential equations. We introduce the integrating factor $e^{-G(t)}$ and consider the following derivative:

$\displaystyle\frac{d}{dt}e^{-G(t)}x(t)=e^{-G(t)}\frac{d}{dt}x(t)+x(t)\frac{d}{dt}e^{-G(t)}$

by the product rule. Then, we can simplify the second term on the right-hand side of the equation using the chain rule and the fundamental theorem of calculus as

$\displaystyle x(t)\frac{d}{dt}e^{-G(t)}=-x(t)e^{-G(t)}\frac{d}{dt}G(t)=-x(t)e^{-G(t)}g(t)$.

Using this, and the assumed differential inequality on $x$, we have that

$\displaystyle\frac{d}{dt}e^{-G(t)}x(t)\le e^{-G(t)}[g(t)x(t)+h(t)]-g(t)x(t)e^{-G(t)}=e^{-G(t)}h(t)$

after simplification (note that $e^{-G(t)}>0$, so multiplying by it preserves the inequality). Now, we again use the fundamental theorem of calculus, and the fact that integrals respect inequalities to obtain that

$\displaystyle\int_0^t\frac{d}{ds}e^{-G(s)}x(s) ds=e^{-G(t)}x(t)-e^{-G(0)}x(0)\le\int_0^te^{-G(s)}h(s)ds$.

Finally, we note that $G(0)=0$ and do some simplification to get that

$\displaystyle x(t) \le x(0)e^{G(t)}+\int_0^t e^{G(t)-G(s)}h(s) ds$.

QED.
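
Here is a small numerical experiment with Theorem 1. It is a sketch under assumptions chosen for illustration: $g(t)=\cos t$ (so $G(t)=\sin t$), $h(t)=1$, and $x$ solving $x'=g(t)x+h(t)-x^2$ with $x(0)=1$, which satisfies the differential inequality because $-x^2\le 0$. The ODE is integrated with a classical Runge-Kutta scheme and the Gronwall bound is evaluated by the midpoint rule.

```python
import math

def g(t: float) -> float:
    return math.cos(t)

def h(t: float) -> float:
    return 1.0

def G(t: float) -> float:
    """Closed form of int_0^t cos(r) dr."""
    return math.sin(t)

def rhs(t: float, x: float) -> float:
    """x' = g x + h - x^2, which is <= g x + h."""
    return g(t) * x + h(t) - x * x

def solve(x0: float, T: float, steps: int = 2000):
    """Classical fourth-order Runge-Kutta; returns a list of (t, x) pairs."""
    dt = T / steps
    t, x = 0.0, x0
    path = [(t, x)]
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + dt / 2, x + dt * k1 / 2)
        k3 = rhs(t + dt / 2, x + dt * k2 / 2)
        k4 = rhs(t + dt, x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
        path.append((t, x))
    return path

def gronwall_bound(t: float, x0: float, n: int = 2000) -> float:
    """x(0) e^{G(t)} + int_0^t e^{G(t) - G(s)} h(s) ds via the midpoint rule."""
    ds = t / n
    integral = ds * sum(
        math.exp(G(t) - G((k + 0.5) * ds)) * h((k + 0.5) * ds) for k in range(n)
    )
    return x0 * math.exp(G(t)) + integral

path = solve(1.0, 2.0)
checks = [(t, x, gronwall_bound(t, 1.0)) for t, x in path[::200]]
for t, x, b in checks:
    print(f"t = {t:.1f}: x(t) = {x:.4f} <= bound = {b:.4f}")
```

Dropping the $-x^2$ term turns the inequality into an equality, and then the bound is attained exactly; the slack term is what separates the two curves.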

This is the most commonly seen version of Gronwall’s inequality. However, an integral form does exist, as Wikipedia is quick to point out.

## Gronwall’s Inequality: Second Version

Theorem 2: Assume that $g,h,x:[0,T]\rightarrow \mathbb{R}$ with $h$ and $x$ continuous, $g$ locally integrable, and $h$ non-negative. Suppose that $x$ satisfies the integral inequality

$\displaystyle x(t)\le g(t)+\int_0^t h(s)x(s) ds$,

then

$\displaystyle x(t)\le g(t)+\int_0^t g(s)h(s)e^{H(t)-H(s)} ds$

for $H(s):=\int_0^s h(r) dr$.

Proof: Define the function $y$ as

$\displaystyle y(s):=e^{-H(s)}\int_0^s h(r)x(r) dr$.

Then, differentiating, and using the product rule, chain rule, and fundamental theorem of calculus, we have, as before, that

$\displaystyle\frac{d}{ds}y(s)=e^{-H(s)}h(s)x(s)-h(s)e^{-H(s)}\int_0^s h(r)x(r) dr=h(s)e^{-H(s)}\left(x(s)-\int_0^s h(r)x(r) dr\right)$.

Thus, using our assumed inequality, we have that

$\displaystyle\frac{d}{ds}y(s)\le g(s)h(s)e^{-H(s)}$.

Note that this last step is where we need to assume that $h(s)\ge 0$. Then, integrating both sides from $0$ to $t$, and noting that $y(0)=0$, we have that

$\displaystyle y(t)=e^{-H(t)}\int_0^t h(r)x(r) dr\le\int_0^t g(s)h(s)e^{-H(s)} ds$.

By the original integral inequality, and after rearranging the terms, we get that

$\displaystyle x(t)\le g(t)+\int_0^t g(s)h(s)e^{H(t)-H(s)} ds$.

QED.
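
The integral form can be tested the same way. The sketch below assumes $g\equiv 1$, $h(s)=2s$ (so $H(s)=s^2$) and $x(t)=e^{t^2/2}$; these are illustrative choices only. It checks numerically that $x$ satisfies the hypothesis $x(t)\le g(t)+\int_0^t h(s)x(s)ds$ and then that the conclusion holds. For these choices the right-hand side of the conclusion works out to $e^{t^2}$.

```python
import math

def g(t: float) -> float:
    return 1.0

def h(s: float) -> float:
    return 2.0 * s              # non-negative on [0, T]

def H(s: float) -> float:
    """Closed form of int_0^s h(r) dr."""
    return s * s

def x(t: float) -> float:
    return math.exp(t * t / 2.0)

def midpoint(f, a: float, b: float, n: int = 2000) -> float:
    """Composite midpoint rule for the integral of f over [a, b]."""
    ds = (b - a) / n
    return ds * sum(f(a + (k + 0.5) * ds) for k in range(n))

def hypothesis_rhs(t: float) -> float:
    """g(t) + int_0^t h(s) x(s) ds."""
    return g(t) + midpoint(lambda s: h(s) * x(s), 0.0, t)

def conclusion_rhs(t: float) -> float:
    """g(t) + int_0^t g(s) h(s) e^{H(t) - H(s)} ds."""
    return g(t) + midpoint(lambda s: g(s) * h(s) * math.exp(H(t) - H(s)), 0.0, t)

times = (0.25, 0.5, 1.0, 1.5)
for t in times:
    print(f"t = {t}: x = {x(t):.4f}, hypothesis rhs = {hypothesis_rhs(t):.4f}, "
          f"Gronwall bound = {conclusion_rhs(t):.4f}")
```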

## Gronwall’s Inequality: Third Version

Both of the above theorems required the use of the fundamental theorems of calculus, and the continuity of the functions involved to invoke it. However, if we wanted to weaken the requirements on the functions involved, we simply need to invoke Lebesgue differentiation, a generalized version of the first part of the fundamental theorem of calculus. Here, we state the one-dimensional version, though higher dimensional versions exist in most measure theory and harmonic analysis texts.

Lemma 1: (Lebesgue Differentiation Theorem) Let $f: \mathbb{R} \rightarrow \mathbb{R}$ be Lebesgue integrable. Then, for almost every $x\in\mathbb{R}$,

$\displaystyle F(x):=\int_a^x f(t) dt$

is differentiable. Moreover, for such an $x$, $F'(x)=f(x)$.

Similarly, we have the following generalization of the second fundamental theorem of calculus:

Lemma 2: If $f: \mathbb{R}\rightarrow \mathbb{R}$ is integrable, and there exists a function $F$ so that $F'(x)=f(x)$ for every $x\in[a,b]$, then

$\displaystyle\int_a^b f(t) dt = F(b)-F(a)$.

In fact, we could even weaken our sense of what “derivative” might mean in this measure-theoretic sense. Even so, we have the following generalized version of Theorem 1, proven using the same methods and the above two lemmas.

Theorem 3: Let $x:[0,T]\rightarrow\mathbb{R}$ be integrable and differentiable on $[0,T]$. Suppose $g,h:[0,T]\rightarrow\mathbb{R}$ are integrable so that

$\displaystyle\frac{d}{dt}x(t)\le g(t)x(t)+h(t)$

for almost every $t\in[0,T]$. Then,

$\displaystyle x(t)\le x(0)e^{G(t)}+\int_0^t e^{G(t)-G(s)}h(s) ds$

for almost every $t\in[0,T]$, where

$\displaystyle G(t):=\int_0^t g(r) dr$.
