estimateR: Estimate latent correlation matrix


estimateR    R Documentation

Estimate latent correlation matrix

Description

Estimates the latent correlation matrix from observed data of (possibly) mixed types (continuous, binary, or truncated continuous), based on the latent Gaussian copula model.
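
To illustrate the idea behind the latent Gaussian copula model, here is a minimal base-R sketch for the simplest case of two continuous variables. It is not the package's implementation. It relies on the standard result that, for continuous variables under a Gaussian copula, Kendall's tau and the latent correlation r are related by tau = (2/pi) * asin(r), so r can be recovered as sin(pi/2 * tau). The variable names and the simulated data are assumptions made for illustration only.

```r
# Sketch only: continuous/continuous case, base R, not the mixedCCA code path.
set.seed(1)
n   <- 500
rho <- 0.6                                   # true latent correlation (illustrative)
z1  <- rnorm(n)
z2  <- rho * z1 + sqrt(1 - rho^2) * rnorm(n) # latent Gaussian pair with correlation rho

# Observed data are strictly increasing transformations of the latent variables.
x1 <- exp(z1)
x2 <- z2^3

# Kendall's tau depends only on ranks, so the transformations do not change it.
tau <- cor(x1, x2, method = "kendall")

# Invert the continuous/continuous bridge function.
r_hat <- sin(pi / 2 * tau)

# Pearson correlation on the observed scale, for comparison; it is affected
# by the nonlinear transformations.
pearson <- cor(x1, x2)
```

Binary and truncated variables use different bridge functions without a closed-form inverse, which is why the method argument below offers numerical inversion or an approximation.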

Usage

estimateR(
  X,
  type = "trunc",
  method = "original",
  use.nearPD = TRUE,
  nu = 0.01,
  tol = 0.001,
  verbose = FALSE
)

estimateR_mixed(
  X1,
  X2,
  type1 = "trunc",
  type2 = "continuous",
  method = "original",
  use.nearPD = TRUE,
  nu = 0.01,
  tol = 0.001,
  verbose = FALSE
)

Arguments

X

A numeric data matrix (n by p), where n is the sample size and p is the number of variables.

type

The type of the variables in X; must be one of "continuous", "binary" or "trunc".

method

The method used to calculate the latent correlation: either "original" or "approx". If method = "approx", a multilinear approximation is used, which is much faster than the original method (requires the chebpol R package). If method = "original", the inverse of the bridge function is found by numerical optimization. The default is "original".

use.nearPD

A logical value indicating whether to apply nearPD when the resulting correlation estimator is not positive definite (has at least one negative eigenvalue).

nu

Shrinkage parameter for the correlation matrix; must be between 0 and 1. The default value is 0.01.

tol

Desired accuracy when solving for the inverse of the bridge function.

verbose

If verbose = TRUE, a message is printed indicating whether nearPD was used. The default value is FALSE, which suppresses this message.

X1

A numeric data matrix (n by p1).

X2

A numeric data matrix (n by p2).

type1

The type of the variables in X1; must be one of "continuous", "binary" or "trunc".

type2

The type of the variables in X2; must be one of "continuous", "binary" or "trunc".
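
The use.nearPD and nu arguments both act on an estimate that may fail to be positive definite. The sketch below shows what such a correction can look like on a small hand-made matrix. It uses nearPD from the Matrix package. The shrinkage step is written as a convex combination with the identity matrix, which is an assumption about the form of the shrinkage and is not confirmed by this page. The matrix R_raw is made up for illustration.

```r
library(Matrix)  # provides nearPD()

# A symmetric matrix with unit diagonal that is NOT positive definite
# (illustrative values; its smallest eigenvalue is negative).
R_raw <- matrix(c( 1.0, 0.9, -0.9,
                   0.9, 1.0,  0.9,
                  -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)
min_eig_raw <- min(eigen(R_raw, symmetric = TRUE)$values)

# Step 1: project onto the nearest positive semidefinite correlation matrix.
R_pd <- as.matrix(nearPD(R_raw, corr = TRUE)$mat)

# Step 2 (assumed form): shrink towards the identity with parameter nu.
nu <- 0.01
R_shrunk <- (1 - nu) * R_pd + nu * diag(nrow(R_pd))
min_eig_shrunk <- min(eigen(R_shrunk, symmetric = TRUE)$values)
```

Under this assumed form, a larger nu pulls the off-diagonal entries further towards zero and raises the smallest eigenvalue, while the diagonal stays at 1.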

Value

estimateR returns a list with the following components:

  • type: Type of the data matrix X

  • R: Estimated p by p latent correlation matrix of X

estimateR_mixed returns a list with the following components:

  • type1: Type of the data matrix X1

  • type2: Type of the data matrix X2

  • R: Estimated latent correlation matrix of the combined data X = (X1, X2) (p1+p2 by p1+p2)

  • R1: Estimated latent correlation matrix of X1 (p1 by p1)

  • R2: Estimated latent correlation matrix of X2 (p2 by p2)

  • R12: Estimated latent correlation matrix between X1 and X2 (p1 by p2)
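
The dimensions listed above suggest that R1, R2 and R12 are the diagonal and off-diagonal blocks of R. The following sketch shows that block layout with small made-up matrices (p1 = 2, p2 = 1). The values are hypothetical and are not output of estimateR_mixed.

```r
# Hypothetical block estimates, for illustration only.
R1  <- matrix(c(1, 0.5,
                0.5, 1), nrow = 2, byrow = TRUE)   # p1 by p1
R2  <- matrix(1, nrow = 1, ncol = 1)               # p2 by p2
R12 <- matrix(c(0.3, 0.2), nrow = 2, ncol = 1)     # p1 by p2

# Assemble the joint (p1 + p2) by (p1 + p2) matrix from its blocks.
R <- rbind(cbind(R1,     R12),
           cbind(t(R12), R2))

# Recover the blocks from the joint matrix by indexing.
p1 <- nrow(R1)
R1_back  <- R[1:p1, 1:p1, drop = FALSE]
R12_back <- R[1:p1, -(1:p1), drop = FALSE]
```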

References

Fan J., Liu H., Ning Y. and Zou H. (2017) "High dimensional semiparametric latent graphical model for mixed data" <doi:10.1111/rssb.12168>.

Yoon G., Carroll R.J. and Gaynanova I. (2020) "Sparse semiparametric canonical correlation analysis for data of mixed types" <doi:10.1093/biomet/asaa007>.

Yoon G., Mueller C.L. and Gaynanova I. (2020) "Fast computation of latent correlations" <arXiv:2006.13875>.

Examples

library(mixedCCA) # provides autocor, blockcor, GenerateData, estimateR and estimateR_mixed

### Data setting
n <- 100; p1 <- 15; p2 <- 10 # sample size and dimensions of the two datasets
maxcancor <- 0.9 # true canonical correlation

### Correlation structure within each data set
set.seed(0)
perm1 <- sample(1:p1, size = p1)
Sigma1 <- autocor(p1, 0.7)[perm1, perm1]
blockind <- sample(1:3, size = p2, replace = TRUE)
Sigma2 <- blockcor(blockind, 0.7)
mu <- rbinom(p1+p2, 1, 0.5)

### true variable indices for each dataset
trueidx1 <- c(rep(1, 3), rep(0, p1-3))
trueidx2 <- c(rep(1, 2), rep(0, p2-2))

### Data generation
simdata <- GenerateData(n = n, trueidx1 = trueidx1, trueidx2 = trueidx2,
                        maxcancor = maxcancor,
                        Sigma1 = Sigma1, Sigma2 = Sigma2,
                        copula1 = "exp", copula2 = "cube",
                        muZ = mu,
                        type1 = "trunc", type2 = "continuous",
                        c1 = rep(1, p1), c2 = rep(0, p2))
X1 <- simdata$X1
X2 <- simdata$X2

### Check the range of truncation levels (proportion of zeros per variable)
range(colMeans(X1 == 0))
range(colMeans(X2 == 0))

### Estimate latent correlation matrix
# with original method
R1_org <- estimateR(X1, type = "trunc", method = "original")$R
# with faster approximation method
R1_approx <- estimateR(X1, type = "trunc", method = "approx")$R
R12_approx <- estimateR_mixed(X1, X2, type1 = "trunc", type2 = "continuous",
                              method = "approx")$R12



mixedCCA documentation built on Sept. 10, 2022, 1:06 a.m.