cmnorm: Parameters of the conditional multivariate normal...

cmnorm R Documentation

Parameters of the conditional multivariate normal distribution

Description

This function calculates the mean (expectation) and the covariance matrix of the conditional multivariate normal distribution.

Usage

cmnorm(
  mean,
  sigma,
  given_ind,
  given_x,
  dependent_ind = numeric(),
  is_validation = TRUE,
  is_names = TRUE,
  control = NULL,
  n_cores = 1L
)

Arguments

mean

numeric vector representing an expectation of the multivariate normal vector (distribution).

sigma

positive definite numeric matrix representing the covariance matrix of the multivariate normal vector (distribution).

given_ind

numeric vector representing indexes of a multivariate normal vector which are conditioned on the values given by the given_x argument.

given_x

numeric vector whose i-th element corresponds to the given value of the given_ind[i]-th element (component) of a multivariate normal vector. If given_x is a numeric matrix then its rows are such vectors of given values.

dependent_ind

numeric vector representing indexes of the unconditioned elements (components) of a multivariate normal vector.

is_validation

logical value indicating whether input arguments should be validated. Set it to FALSE to get a performance boost (default value is TRUE).

is_names

logical value indicating whether output values should have row and column names. Set it to FALSE to get a performance boost (default value is TRUE).

control

a list of control parameters. See Details.

n_cores

positive integer representing the number of CPU cores used for parallel computing. Currently it is not recommended to set n_cores > 1 if vectorized arguments include fewer than 100000 elements.

Details

Consider an m-dimensional multivariate normal vector X=(X_{1},...,X_{m})^{T}\sim N(\mu,\Sigma), where E(X)=\mu and Cov(X)=\Sigma are the expectation (mean) and covariance matrix respectively.

Denote the vectors of indexes of the conditioned and unconditioned elements of X by I_{g} and I_{d} respectively. By x^{(g)} denote a deterministic (column) vector of given values of X_{I_{g}}. The function calculates the expected value and the covariance matrix of the conditional distribution of X_{I_{d}} given X_{I_{g}} = x^{(g)}. For example, if m=5, I_{g}=(1, 3) and x^{(g)}=(-1, 1) then I_{d}=(2, 4, 5) so the function calculates:

\mu_{c}=E\left(\left(X_{2}, X_{4}, X_{5}\right) | X_{1} = -1, X_{3} = 1\right)

\Sigma_{c}=Cov\left(\left(X_{2}, X_{4}, X_{5}\right) | X_{1} = -1, X_{3} = 1\right)

In the general case:

\mu_{c} = E\left(X_{I_{d}} | X_{I_{g}} = x^{(g)}\right) = \mu_{I_{d}} + \Sigma_{(I_{d}, I_{g})} \Sigma_{(I_{g}, I_{g})}^{-1} \left(x^{(g)} - \mu_{I_{g}}\right)

\Sigma_{c} = Cov\left(X_{I_{d}} | X_{I_{g}} = x^{(g)}\right) = \Sigma_{(I_{d}, I_{d})} - \Sigma_{(I_{d}, I_{g})} \Sigma_{(I_{g}, I_{g})}^{-1} \Sigma_{(I_{g}, I_{d})}

Note that \Sigma_{(I_{A}, I_{B})}, where A,B\in\{d, g\}, is the submatrix of \Sigma formed by the intersection of the I_{A} rows and the I_{B} columns of \Sigma.
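The general-case formulas can be reproduced with base R matrix operations. The following is a minimal sketch, not the package's implementation; the parameter values and the names mu, Sigma, I_g, I_d, x_g, B, mu_c and Sigma_c are illustrative assumptions.

```r
# Illustrative parameters (assumed values, not taken from the package)
mu <- c(-2, -1, 0, 1, 2)
A <- matrix(c(2, 0, 0, 0, 0,
              1, 2, 0, 0, 0,
              0, 1, 2, 0, 0,
              1, 0, 1, 2, 0,
              0, 1, 0, 1, 2), nrow = 5, byrow = TRUE)
Sigma <- A %*% t(A)                 # positive definite by construction

I_g <- c(1, 3)                      # indexes of the conditioned components
I_d <- setdiff(seq_along(mu), I_g)  # indexes of the unconditioned components
x_g <- c(-1, 1)                     # given values of X1 and X3

# Submatrices of Sigma used in the formulas
Sigma_dd <- Sigma[I_d, I_d, drop = FALSE]
Sigma_dg <- Sigma[I_d, I_g, drop = FALSE]
Sigma_gg <- Sigma[I_g, I_g, drop = FALSE]

# Sigma_dg %*% solve(Sigma_gg), obtained by solving a linear system
# instead of forming the inverse explicitly (Sigma_gg is symmetric)
B <- t(solve(Sigma_gg, t(Sigma_dg)))

# Conditional mean and conditional covariance matrix
mu_c <- mu[I_d] + as.vector(B %*% (x_g - mu[I_g]))
Sigma_c <- Sigma_dd - B %*% t(Sigma_dg)
```

As a sanity check, Sigma_c should coincide with the inverse of the I_d block of solve(Sigma), which is a standard property of the Schur complement.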

The theoretical (mathematical) notation above corresponds to the function arguments as follows:

  • mean - \mu.

  • sigma - \Sigma.

  • given_ind - I_{g}.

  • given_x - x^{(g)}.

  • dependent_ind - I_{d}.

Moreover, \Sigma_{(I_{g}, I_{d})} is the theoretical (mathematical) notation for sigma[given_ind, dependent_ind]. Similarly, \mu_{I_{g}} represents mean[given_ind].

By default dependent_ind contains all indexes that are not in given_ind. Indexes may be omitted from or duplicated in dependent_ind. However, given_ind must contain at least one index and no duplicates. Also, dependent_ind and given_ind must not share any elements. Moreover, given_ind must be shorter than mean, so that at least one component remains unconditioned.
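The index rules above can be summarized in a few lines of base R. This is only a sketch under the stated rules: resolve_ind is a hypothetical helper that is not part of the package, and the package's own validation may differ.

```r
# Hypothetical helper (not part of mnorm) mirroring the index rules
resolve_ind <- function(n_dim, given_ind, dependent_ind = numeric()) {
  stopifnot(
    length(given_ind) >= 1,              # at least one conditioned index
    anyDuplicated(given_ind) == 0,       # no duplicates in given_ind
    length(given_ind) < n_dim,           # at least one unconditioned component
    all(given_ind %in% seq_len(n_dim))   # indexes must be valid
  )
  # Default: all indexes that are not in given_ind, in ascending order
  if (length(dependent_ind) == 0) {
    dependent_ind <- setdiff(seq_len(n_dim), given_ind)
  }
  stopifnot(
    all(dependent_ind %in% seq_len(n_dim)),
    !any(dependent_ind %in% given_ind)   # the two index sets must be disjoint
  )
  dependent_ind
}

# Default dependent_ind for a 5-dimensional vector
resolve_ind(5, given_ind = c(1, 3))
# Omitted and duplicated indexes are accepted for dependent_ind
resolve_ind(5, given_ind = c(1, 3), dependent_ind = c(5, 2, 5))
```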

If given_x is a vector, then (if possible) it will be treated as a matrix with the number of columns equal to the length of given_ind.

Currently control has no input arguments intended for the users. This argument is used for some internal purposes of the package.

Value

This function returns an object of class "mnorm_cmnorm".

An object of class "mnorm_cmnorm" is a list containing the following components:

  • mean - a conditional mean.

  • sigma - a conditional covariance matrix.

  • sigma_d - a covariance matrix of the unconditioned elements.

  • sigma_g - a covariance matrix of the conditioned elements.

  • sigma_dg - a matrix of the covariances between the unconditioned and conditioned elements.

  • s12s22 - equals the matrix product of sigma_dg and solve(sigma_g).

Note that mean corresponds to \mu_{c}, while sigma represents \Sigma_{c}. Moreover, sigma_d is \Sigma_{(I_{d}, I_{d})}, sigma_g is \Sigma_{(I_{g}, I_{g})} and sigma_dg is \Sigma_{(I_{d}, I_{g})}.

Since \Sigma_{c} does not depend on x^{(g)}, the output sigma does not depend on given_x. In particular, output sigma remains the same regardless of whether given_x is a matrix or a vector. In contrast, if given_x is a matrix, then output mean is a matrix whose rows are the conditional means associated with the given values in the corresponding rows of given_x.
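The row-wise behavior described above can be sketched in base R. The parameter values and the names x_g_mat, B, mu_c_mat and Sigma_c are illustrative assumptions, and the code is not the package's implementation.

```r
# Illustrative three-dimensional example (assumed values)
mu <- c(0, 1, 2)
Sigma <- matrix(c(4, 1, 2,
                  1, 3, 1,
                  2, 1, 5), nrow = 3, byrow = TRUE)
I_g <- 1          # X1 is conditioned
I_d <- c(2, 3)    # X2 and X3 are unconditioned

# Each row of the matrix is one vector of given values
x_g_mat <- rbind(-1, 2)

# Matrix product of the cross-covariance and the inverse of the
# covariance matrix of the conditioned components
B <- Sigma[I_d, I_g, drop = FALSE] %*% solve(Sigma[I_g, I_g, drop = FALSE])

# One conditional mean per row of x_g_mat
centered <- sweep(x_g_mat, 2, mu[I_g])            # subtract the mean of X1
mu_c_mat <- sweep(centered %*% t(B), 2, mu[I_d], "+")

# The conditional covariance matrix is shared by all rows
Sigma_c <- Sigma[I_d, I_d, drop = FALSE] - B %*% Sigma[I_g, I_d, drop = FALSE]
```

Here mu_c_mat has one row per row of x_g_mat, while Sigma_c is computed once because it does not involve the given values.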

The order of the elements of output mean and output sigma follows the order of dependent_ind elements (which is ascending by default). The order of given_ind elements does not matter, as long as it matches the order of the given values, i.e., the order of given_x columns.
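The effect of the order of dependent_ind can be illustrated in base R. In this sketch cond_par is a hypothetical helper (not part of the package) that applies the conditional mean and covariance formulas directly; the parameter values are assumptions chosen for illustration.

```r
# Hypothetical helper (not part of mnorm): conditional parameters in base R
cond_par <- function(mu, Sigma, I_g, x_g, I_d) {
  B <- Sigma[I_d, I_g, drop = FALSE] %*% solve(Sigma[I_g, I_g, drop = FALSE])
  list(mean = mu[I_d] + as.vector(B %*% (x_g - mu[I_g])),
       sigma = Sigma[I_d, I_d, drop = FALSE] -
         B %*% Sigma[I_g, I_d, drop = FALSE])
}

mu <- c(0, 1, 2)
Sigma <- matrix(c(4, 1, 2,
                  1, 3, 1,
                  2, 1, 5), nrow = 3, byrow = TRUE)

# Ascending order of the unconditioned indexes
asc <- cond_par(mu, Sigma, I_g = 1, x_g = -1, I_d = c(2, 3))
# Reordered indexes with a duplicate
dup <- cond_par(mu, Sigma, I_g = 1, x_g = -1, I_d = c(3, 2, 3))

# Positions of the reordered indexes within the ascending ones
pos <- match(c(3, 2, 3), c(2, 3))

# Reordering or duplicating indexes only permutes or repeats the
# corresponding elements of the output
isTRUE(all.equal(dup$mean, asc$mean[pos]))
isTRUE(all.equal(dup$sigma, asc$sigma[pos, pos]))
```

Note that a covariance matrix built from duplicated indexes contains identical rows and columns, so it is singular.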

Examples

# Consider a multivariate normal vector:
# X = (X1, X2, X3, X4, X5) ~ N(mean, sigma)

# Prepare the multivariate normal vector parameters
  # the expected value
mean <- c(-2, -1, 0, 1, 2)
n_dim <- length(mean)
  # the correlation matrix
cor <- c(   1,  0.1,  0.2,   0.3,  0.4,
          0.1,    1, -0.1,  -0.2, -0.3,
          0.2, -0.1,    1,   0.3,  0.2,
          0.3, -0.2,  0.3,     1, -0.05,
          0.4, -0.3,  0.2, -0.05,     1)
cor <- matrix(cor, ncol = n_dim, nrow = n_dim, byrow = TRUE)
  # the covariance matrix
sd_mat <- diag(c(1, 1.5, 2, 2.5, 3))
sigma <- sd_mat %*% cor %*% t(sd_mat)

# Calculate the parameters of the conditional distribution, i.e.,
# when the first and the third components of X are conditioned:
# (X2, X4, X5 | X1 = -1, X3 = 1)
given_ind <- c(1, 3)
given_x <- c(-1, 1)
par <- cmnorm(mean = mean, sigma = sigma,
              given_ind = given_ind,
              given_x = given_x)
  # E(X2, X4, X5 | X1 = -1, X3 = 1)
par$mean
  # Cov(X2, X4, X5 | X1 = -1, X3 = 1)
par$sigma

# Additionally, calculate E(X2, X4, X5 | X1 = 2, X3 = 3)
given_x_mat <- rbind(given_x, c(2, 3))
par1 <- cmnorm(mean = mean, sigma = sigma,
               given_ind = given_ind,
               given_x = given_x_mat)
par1$mean

# Duplicates and omitted indexes are allowed for dependent_ind
# For given_ind, duplicates are not allowed
# Let's calculate the conditional parameters 
# for (X5, X2, X5 | X1 = -1, X3 = 1):
dependent_ind <- c(5, 2, 5)
par2 <- cmnorm(mean = mean, sigma = sigma,
               given_ind = given_ind,
               given_x = given_x,
               dependent_ind = dependent_ind)
  # E(X5, X2, X5 | X1 = -1, X3 = 1)
par2$mean
  # Cov(X5, X2, X5 | X1 = -1, X3 = 1)
par2$sigma

mnorm documentation built on April 14, 2026, 5:07 p.m.