\newcommand{\yH}{\hat{y}} \newcommand{\x}{\mathbf{X}} \newcommand{\xt}{\x^\top} \newcommand{\xtx}{\xt\x} \newcommand{\xtxI}{(\xtx)^{-1}} \newcommand{\b}{\beta} \newcommand{\bt}{\b^\top} \newcommand{\xb}{\mathbf{\x\b}} \newcommand{\ep}{\epsilon} \newcommand{\bH}{\hat{\b}} \newcommand{\bHt}{\bH^\top} \newcommand{\xbH}{\x\bH} \newcommand{\xtxBH}{(\xtx)\bH} \newcommand{\xty}{\xt y} \newcommand{\E}{\mathrm{E}} \newcommand{\s}{\sigma} \newcommand{\ss}{\s^2} \newcommand{\ssH}{\hat{\ss}} \newcommand{\xo}{\x_0} \newcommand{\xof}{\x_{0,f}} \newcommand{\xot}{\xo^\top} \newcommand{\v}{\mathrm{var}} \newcommand{\a}{\mathbf{A}} \newcommand{\xoma}{(\xo - \a)} \newcommand{\xomat}{\xoma^\top}
Consider linear model
[ y = \xb + \ep ]
Let $\bH$ be the estimate of the population parameter $\b$, we want to minimize the sum of squared residuals $\sum \ep_i^2$, with
\begin{align} \ep &= y - \xbH \ \ep^\top\ep &= (y - \xbH) (y - \xbH) \ &=y^\top y - 2\bHt\xt y + \bHt \xtx \bH \end{align}
Minimizing above, i.e, taking derivative w.r.t $\bH$ and equating to $0$, we have
[ \xtxBH = \xty ]
Provided $\xtxI$ exists:
\begin{align} \xtxI \xtxBH &= \xtxI \xty\ \implies \bH &= \xtxI \xty \end{align}
From the result above
\begin{align} \bH &= \xtxI \xty\ &= \xtxI \xt (\xb + \ep) \ &= \b + \xtxI \xt \ep \end{align}
\begin{align} \E[(\bH - \b)(\bH - \b)^\top] &= \E[(\xtxI \xt \ep) (\xtxI \xt \ep)^T] \ &= \E[\xtxI \xt \ep\ep^\top \x \xtxI] \ &= \xtxI \xt \E[\ep\ep^\top] \x \xtxI \ &= \ss \xtxI \end{align}
and
[ \ssH = \frac{\ep^\top\ep}{n - k} ]
Let $\xo$ be the design matrix containing values of the predictors, from which we want to make predictions, then
\begin{align} \hat{y} &= \xo\bH\ &= \xo\xtxI \xty \end{align}
and
\begin{align} \v{(\hat{y})} &= \v[\xo\xtxI \xty]\ &= \xo\xtxI \xt \v(y) [\xo\xtxI \xt]^\top \ &= \ss \xo\xtxI \xt \x \xtxI \xot\ &= \ss \xo \xtxI \xot \end{align}
Let $\xo$ be a centered model matrix containing appropriately chosen values of focal predictors and non-focal predictors as the average of the corresponding columns in the model matrix, and let $\a$ be the anchor matrix, with same dimensions as $\xo$ and values in each columns repeated $n$ times. Also, let $\a_f$ be the anchor column corresponding to the focal predictor, the rest of the columns corresponds the non-focal predictors with same values as in $\xo$. Any appropriate value (within the range of focal values) can be chosen for $\a_f$ but by default center point is chosen. Then
\begin{align} \v{(\hat{y})} &= \ss \xoma \xtxI \xomat \end{align}
We can see that, for any values $\a_f=\xof$, $\v(\hat{y}) = \boldsymbol{0}$. If $\a_f = \overline{\xof}$, then the anchor is the center point at this observation will still hold.
We need to figure out how to quantify and compare all the variances resulting from choosing other anchors, i.e., $\a_f < \overline{\xof}$ and $\a_f > \overline{\xof}$.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.