dmnorm (R Documentation)
This function calculates and differentiates the density of the (conditional) multivariate normal distribution.
dmnorm(
x,
mean,
sigma,
given_ind = numeric(),
log = FALSE,
grad_x = FALSE,
grad_sigma = FALSE,
is_validation = TRUE,
control = NULL,
n_cores = 1L
)
x |
numeric vector representing the point at which the density
should be calculated. If x is a matrix, then each of its rows
determines a separate point at which the density is evaluated. |
mean |
numeric vector representing an expectation of the multivariate normal vector (distribution). |
sigma |
positive definite numeric matrix representing the covariance matrix of the multivariate normal vector (distribution). |
given_ind |
numeric vector representing indexes of the multivariate
normal vector that are conditioned on, i.e., the components whose
values are fixed at the corresponding elements of x. |
log |
logical; if TRUE, then the log-density is returned and the
gradients are calculated with respect to the log-density. |
grad_x |
logical; if TRUE, then the gradient of the density with
respect to x is also calculated. |
grad_sigma |
logical; if TRUE, then the gradient of the density with
respect to the elements of sigma is also calculated. |
is_validation |
logical value indicating whether input
arguments should be validated. Set it to FALSE for a slight
performance gain. |
control |
a list of control parameters. See Details. |
n_cores |
positive integer representing the number of CPU cores
used for parallel computing. Currently it is not recommended to set
n_cores to a value greater than 1. |
Consider the notation from the Details section of
cmnorm. The function calculates the density
f(x^{(d)}|x^{(g)}) of the conditional multivariate normal vector
X_{I_{d}} | X_{I_{g}} = x^{(g)}, where x^{(d)} is the subvector of
x associated with X_{I_{d}}, i.e., the unconditioned components.
Therefore x[given_ind] represents x^{(g)}, while
x[-given_ind] is x^{(d)}.
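This partition of x can be sketched in base R (the indexes below are purely illustrative):

```r
# Splitting a point x into conditioned and unconditioned parts
x <- c(-1, 0, 1, 2, 3)
given_ind <- c(1, 3)     # condition on the 1-st and 3-rd components
x_g <- x[given_ind]      # x^(g): conditioned values, c(-1, 1)
x_d <- x[-given_ind]     # x^(d): unconditioned values, c(0, 2, 3)
```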
If grad_x is TRUE, then the function additionally estimates the
gradient with respect to both the unconditioned and conditioned components:
\nabla f(x^{(d)}|x^{(g)})=
\left(\frac{\partial f(x^{(d)}|x^{(g)})}{\partial x_{1}}
,...,
\frac{\partial f(x^{(d)}|x^{(g)})}{\partial x_{m}}\right),
where each x_{i} belongs either to x^{(d)} or x^{(g)}
depending on whether i\in I_{d} or i\in I_{g}.
In particular, the gradients of the log-density with respect to
x^{(d)} and x^{(g)} are of the form:
\nabla_{x^{(d)}}\ln f(x^{(d)}|x^{(g)}) =
-\left(x^{(d)}-\mu_{c}\right)\Sigma_{c}^{-1}
\nabla_{x^{(g)}}\ln f(x^{(d)}|x^{(g)}) =
-\nabla_{x^{(d)}}\ln f(x^{(d)}|x^{(g)})\Sigma_{d,g}\Sigma_{g,g}^{-1}
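These conditional-density formulas can be checked with base R alone in a bivariate case (X2 | X1 = x_g), where the conditional mean and variance reduce to scalars; all names below are illustrative:

```r
# Base-R sketch verifying the conditional density and its gradient
mu    <- c(0, 1)
Sigma <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)
x_g <- 0.3                               # conditioning value for X1
x_d <- 1.2                               # evaluation point for X2
# Conditional parameters: mu_c = mu_d + S_dg S_gg^{-1} (x_g - mu_g)
mu_c    <- mu[2] + Sigma[2, 1] / Sigma[1, 1] * (x_g - mu[1])
sigma_c <- Sigma[2, 2] - Sigma[2, 1] ^ 2 / Sigma[1, 1]
# Conditional log-density and its gradient with respect to x_d,
# following the formula grad = -(x_d - mu_c) * sigma_c^{-1}
log_f  <- dnorm(x_d, mean = mu_c, sd = sqrt(sigma_c), log = TRUE)
grad_d <- -(x_d - mu_c) / sigma_c
```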
If grad_sigma is TRUE, then the function additionally estimates
the gradient with respect to the elements of covariance matrix \Sigma.
For i\in I_{d} and j\in I_{d}, the function calculates:
\frac{\partial \ln f(x^{(d)}|x^{(g)})}{\partial \Sigma_{i, j}} =
\left(\frac{\partial \ln f(x^{(d)}|x^{(g)})}{\partial x_{i}} \times
\frac{\partial \ln f(x^{(d)}|x^{(g)})}{\partial x_{j}} -
\Sigma_{c,(i, j)}^{-1}\right) /
\left(1 + I(i=j)\right),
where I(i=j) is an indicator function which equals 1 when
the condition i=j is satisfied and 0 otherwise.
For i\in I_{d} and j\in I_{g}, the following formula is used:
\frac{\partial \ln f(x^{(d)}|x^{(g)})}{\partial \Sigma_{i, j}} =
-\frac{\partial \ln f(x^{(d)}|x^{(g)})}{\partial x_{i}} \times
\left(\left(x^{(g)}-\mu_{g}\right)\Sigma_{g,g}^{-1}\right)_{q_{g}(j)}
-\sum\limits_{k=1}^{n_{d}}(1+I(q_{d}(i)=k))\times
(\Sigma_{d,g}\Sigma_{g,g}^{-1})_{k,q_{g}(j)}\times
\frac{\partial \ln f(x^{(d)}|x^{(g)})}{\partial \Sigma_{i, q^{-1}_{d}(k)}},
where q_{g}(j)=\sum\limits_{k=1}^{m} I\left(I_{g,k} \leq j\right)
and q_{d}(i)=\sum\limits_{k=1}^{m} I\left(I_{d,k} \leq i\right)
represent the positions of the j-th and i-th elements
in I_{g} and I_{d}, respectively, i.e.,
x_{i}=x^{(d)}_{q_{d}(i)}=x_{I_{d, q_{d}(i)}} and
x_{j}=x^{(g)}_{q_{g}(j)}=x_{I_{g, q_{g}(j)}}.
Note that q_{d}^{-1} and q_{g}^{-1} denote the corresponding inverse functions.
The number of conditioned and unconditioned components is denoted by
n_{g}=\sum\limits_{k=1}^{m}I(k\in I_{g}) and
n_{d}=\sum\limits_{k=1}^{m}I(k\in I_{d}) respectively.
For the case i\in I_{g} and j\in I_{d}, the formula is similar.
For i\in I_{g} and j\in I_{g}, the following formula is used:
\frac{\partial \ln f(x^{(d)}|x^{(g)})}{\partial \Sigma_{i, j}} =
-\nabla_{x^{(d)}}\ln f(x^{(d)}|x^{(g)})\times
\left(x^{(g)}\times(\Sigma_{d,g} \times \Sigma_{g,g}^{-1} \times I_{g}^{*} \times
\Sigma_{g,g}^{-1})^{T}\right)^T
-\sum\limits_{k_{1}=1}^{n_{d}}\sum\limits_{k_{2}=k_{1}}^{n_{d}}
\frac{\partial \ln f(x^{(d)}|x^{(g)})}
{\partial \Sigma_{q_{d}(k_{1})^{-1}, q_{d}(k_{2})^{-1}}}
\left(\Sigma_{d,g} \times \Sigma_{g,g}^{-1} \times I_{g}^{*} \times
\Sigma_{g,g}^{-1}\times\Sigma_{d,g}^T\right)_{q_{d}(k_{1})^{-1},
q_{d}(k_{2})^{-1}},
where I_{g}^{*} is a square n_{g}-dimensional matrix of
zeros except I_{g,(i,j)}^{*}=I_{g,(j,i)}^{*}=1.
Argument given_ind represents I_{g} and it should not
contain any duplicates. The order of given_ind elements
does not matter, so it has no impact on the output.
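A small sketch (assuming the mnorm package is attached; the inputs are illustrative) of the claim that the order of given_ind elements does not matter:

```r
# Conditioning with given_ind in two different orders
library(mnorm)
x     <- c(-1, 0, 1)
mean  <- c(0, 0, 0)
sigma <- diag(3)
d.a <- dmnorm(x, mean, sigma, given_ind = c(1, 3))
d.b <- dmnorm(x, mean, sigma, given_ind = c(3, 1))
# d.a$den and d.b$den should coincide
```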
More details on the aforementioned differentiation formulas can be found in the appendix of E. Kossova and B. Potanin (2018).
Currently control has no input arguments intended for
the users. This argument is used for some internal purposes
of the package.
This function returns an object of class "mnorm_dmnorm".
An object of class "mnorm_dmnorm" is a list containing the
following components:
den - the density function value at x.
grad_x - a gradient of the density with respect to x, if
the grad_x or grad_sigma input argument is set to TRUE.
grad_sigma - a gradient with respect to the elements of
sigma, if the grad_sigma input argument is set to TRUE.
If log is TRUE, then den is a log-density,
so the outputs grad_x and grad_sigma are calculated with
respect to the log-density.
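By the chain rule, the log-density gradient is the density gradient divided by the density itself; the following sketch (assuming the mnorm package is attached; inputs are illustrative) shows this relation between the two settings of log:

```r
# Relation grad(log f) = grad(f) / f between the two output modes
library(mnorm)
x     <- c(-1, 0, 1)
mean  <- c(0, 0, 0)
sigma <- diag(3)
d     <- dmnorm(x, mean, sigma, grad_x = TRUE, log = FALSE)
d_log <- dmnorm(x, mean, sigma, grad_x = TRUE, log = TRUE)
# d_log$grad_x should equal d$grad_x / d$den (up to numeric tolerance)
```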
Output grad_x is a Jacobian matrix whose rows are the gradients of
the density function calculated for each row of x. Therefore,
grad_x[i, j] is a derivative of the density function with respect to
the j-th argument at the point x[i, ].
Output grad_sigma is a 3D array such that grad_sigma[i, j, k]
is a partial derivative of the density function with respect to the
sigma[i, j] estimated for the observation x[k, ].
E. Kossova, B. Potanin (2018). Heckman method and switching regression model multivariate generalization. Applied Econometrics, vol. 50, pages 114-143.
# Consider a multivariate normal vector:
# X = (X1, X2, X3, X4, X5) ~ N(mean, sigma)
# Prepare the multivariate normal vector parameters
# the expected value
mean <- c(-2, -1, 0, 1, 2)
n_dim <- length(mean)
# the correlation matrix
cor <- c( 1, 0.1, 0.2, 0.3, 0.4,
0.1, 1, -0.1, -0.2, -0.3,
0.2, -0.1, 1, 0.3, 0.2,
0.3, -0.2, 0.3, 1, -0.05,
0.4, -0.3, 0.2, -0.05, 1)
cor <- matrix(cor, ncol = n_dim, nrow = n_dim, byrow = TRUE)
# the covariance matrix
sd_mat <- diag(c(1, 1.5, 2, 2.5, 3))
sigma <- sd_mat %*% cor %*% t(sd_mat)
# Estimate the density of X at the point (-1, 0, 1, 2, 3)
x <- c(-1, 0, 1, 2, 3)
d.list <- dmnorm(x = x, mean = mean, sigma = sigma)
d <- d.list$den
print(d)
# Estimate the density of X at points
# x=(-1, 0, 1, 2, 3) and y=(-1.5, -1.2, 0, 1.2, 1.5)
y <- c(-1.5, -1.2, 0, 1.2, 1.5)
xy <- rbind(x, y)
d.list.1 <- dmnorm(x = xy, mean = mean, sigma = sigma)
d.1 <- d.list.1$den
print(d.1)
# Estimate the density of Xc=(X2, X4, X5 | X1 = -1, X3 = 1) at
# the point xd=(0, 2, 3) given the conditioning values xg = (-1, 1)
given_ind <- c(1, 3)
d.list.2 <- dmnorm(x = x, mean = mean, sigma = sigma,
given_ind = given_ind)
d.2 <- d.list.2$den
print(d.2)
# Estimate the gradient of the density with respect to the argument and
# covariance matrix at points 'x' and 'y'
d.list.3 <- dmnorm(x = xy, mean = mean, sigma = sigma,
grad_x = TRUE, grad_sigma = TRUE)
# Gradient with respect to the argument
grad_x.3 <- d.list.3$grad_x
# at the point 'x'
print(grad_x.3[1, ])
# at the point 'y'
print(grad_x.3[2, ])
# Partial derivative at the point 'y' with respect
# to the 3-rd argument
print(grad_x.3[2, 3])
# Gradient with respect to the covariance matrix
grad_sigma.3 <- d.list.3$grad_sigma
# Partial derivative with respect to sigma(3, 5) at
# point 'y'
print(grad_sigma.3[3, 5, 2])
# Estimate the gradient of the log-density function of
# Xc=(X2, X4, X5 | X1 = -1, X3 = 1) and Yc=(X2, X4, X5 | X1 = -1.5, X3 = 0)
# with respect to the argument and covariance matrix at
# points xd=(0, 2, 3) and yd=(-1.2, 0, 1.5), respectively, given
# the conditioning values xg=(-1, 1) and yg=(-1.5, 0) correspondingly
d.list.4 <- dmnorm(x = xy, mean = mean, sigma = sigma,
grad_x = TRUE, grad_sigma = TRUE,
given_ind = given_ind, log = TRUE)
# Gradient with respect to the argument
grad_x.4 <- d.list.4$grad_x
# at point 'xd'
print(grad_x.4[1, ])
# at point 'yd'
print(grad_x.4[2, ])
# Partial derivative at the point 'xd' with respect to 'xg[2]'
print(grad_x.4[1, 3])
# Partial derivative at the point 'yd' with respect to 'yd[3]'
print(grad_x.4[2, 5])
# Gradient with respect to the covariance matrix
grad_sigma.4 <- d.list.4$grad_sigma
# Partial derivative with respect to sigma(3, 5) at
# point 'yd'
print(grad_sigma.4[3, 5, 2])
# Compare analytical gradients from the previous example with
# their numeric (forward difference) analogues at point 'xd'
# given the conditioning 'xg'
delta <- 1e-6
grad_x.num <- rep(NA, 5)
grad_sigma.num <- matrix(NA, nrow = 5, ncol = 5)
for (i in 1:5)
{
x.delta <- x
x.delta[i] <- x[i] + delta
d.list.delta <- dmnorm(x = x.delta, mean = mean, sigma = sigma,
grad_x = TRUE, grad_sigma = TRUE,
given_ind = given_ind, log = TRUE)
grad_x.num[i] <- (d.list.delta$den - d.list.4$den[1]) / delta
for (j in 1:5)
{
sigma.delta <- sigma
sigma.delta[i, j] <- sigma[i, j] + delta
sigma.delta[j, i] <- sigma[j, i] + delta
d.list.delta <- dmnorm(x = x, mean = mean, sigma = sigma.delta,
grad_x = TRUE, grad_sigma = TRUE,
given_ind = given_ind, log = TRUE)
grad_sigma.num[i, j] <- (d.list.delta$den - d.list.4$den[1]) / delta
}
}
# Comparison of gradients with respect to the argument
h.x <- cbind(analytical = grad_x.4[1, ], numeric = grad_x.num)
rownames(h.x) <- c("xg[1]", "xd[1]", "xg[2]", "xd[2]", "xd[3]")
print(h.x)
# Comparison of gradients with respect to the covariance matrix
h.sigma <- list(analytical = grad_sigma.4[, , 1], numeric = grad_sigma.num)
print(h.sigma)