Here’s a new “overthinking it” issue for you. For a while now, I’ve been relying on the high-dimensional version of a Taylor series approximation result to help me flip variables and gradients. That is, for 1-D functions with continuous Hessians everywhere, the mean value theorem says that there always exists a $z$ where $x\leq z \leq y$, such that

$$

g”(z) (x-y)=g'(x)-g'(y).

$$

Without blinking, we use this to extend to $f:\mathbb R^n\to \mathbb R$, and say that there always exists some $z$ where

$$

\nabla^2 f(z) (x-y) = \nabla f(x) – \nabla f(y).\tag{MVT-hd}

$$

Never mind that the notion of “between $x$ and $y$” doesn’t really make sense. We just love the result, and use it to flip primal and dual solutions, based on the qualities of $\nabla^2 f(z)$, willy nilly.

But, one thing I am realizing as I go through my convex analysis adventure, is that* everything that is true in high dimensions can be proven using only tools in low dimensions. * And by low dimensions, I mean 1-dimensional. So, I should be able to, using only 1-D function knowledge, prove (MVT-hd).

First, (MVT-hd) is true if

$$

u_k^T\nabla^2 f(z) (x-y) =u_k^T( \nabla f(x) – \nabla f(y)).\tag{MVT-hd}

$$

for some set of $u_1,u_2,…,u_n$ spanning $\mathbb R^n$. To make life easy, let’s pick the standard basis. Then,

$$

e_k^T\nabla^2 f(z)(x-y) = \sum_{i=1}^n \frac{\partial f(u)^2}{\partial u_k\partial u_k}\Bigg|_{u=z} \cdot (x_i-y_i)

$$

and

$$

e_k^T(\nabla f(x)-\nabla f(y)) = \frac{\partial f(u)}{\partial u_k}\Bigg|_{u=x}- \frac{\partial f(u)}{\partial u_k}\Bigg|_{u=y}

$$

Let us now define some dummy 1-D functions as

$$

g_k(\alpha) = \frac{\partial f(u)}{\partial u_k}\Bigg|_{u=x+\alpha(y-x)}

$$

Then,

$$

e_k^T(\nabla f(x)-\nabla f(y))=g_k(1)-g_k(0)

$$

and

$$

g’_k(\alpha) = \sum_{j=1}^n \frac{\partial^2 f(u)}{\partial u_k\partial u_j}\Bigg|_{u=x+\alpha(y-x)}.\cdot (y_j-x_j)

$$

Now this is very promising! because now we can invoke MVT on $g_k(\alpha)$ and say that there must exist some $0\leq \bar \alpha \leq 1$ where

$$

g_k(1)-g_k(0)=g’_k(\bar\alpha)\iff e_k^T(\nabla f(x)-\nabla f(y)) = \sum_{j=1}^n \frac{\partial^2 f(u)}{\partial u_k\partial u_j}\Bigg|_{u=x+\bar \alpha(y-x)}\cdot (y_j-x_j).

$$

However, there’s one catch: there’s no reason to assume that this $\bar\alpha$ is the *same* point for each $k$. So, while (MVT-hd) has *one* $z$ for its relation, here, we’ve got $n$ *different* $\bar \alpha_k$s, corresponding to each $g_k$.

Of course, you could always create a new function $g(\alpha) = \sum_k u_k g_k(\alpha)$, and following the same analysis, you could show that, for any $u$, there exists some $z$ where

$$

u^T\nabla^2 f(z) (x-y) =u^T( \nabla f(x) – \nabla f(y)).\tag{MVT-hd-proj}

$$

It turns out that this alone is enough for me to prove most of what I need. However, right now I am hesitant to say that one unique $z$ exists for *all* $u$.

I hope to be proven wrong, but in the meantime, I shall begin my search for a counterexample!