GenerateData: Mixed type simulation data generator for sparse CCA

In irinagain/mixedCCA: Sparse Canonical Correlation Analysis for High-Dimensional Mixed Data

View source: R/GenerateData.R

GenerateData

R Documentation

Mixed type simulation data generator for sparse CCA

Description

GenerateData generates two datasets of possibly different types (continuous, truncated or binary) for sparse CCA under the Gaussian copula model.

Usage

GenerateData(
  n,
  trueidx1,
  trueidx2,
  Sigma1,
  Sigma2,
  maxcancor,
  copula1 = "no",
  copula2 = "no",
  type1 = "continuous",
  type2 = "continuous",
  muZ = NULL,
  c1 = NULL,
  c2 = NULL
)

Arguments

`n`	Sample size
`trueidx1`	True canonical direction of length p1 for `X1`. It will be automatically normalized such that w_1^T Σ_1 w_1 = 1.
`trueidx2`	True canonical direction of length p2 for `X2`. It will be automatically normalized such that w_2^T Σ_2 w_2 = 1.
`Sigma1`	True correlation matrix of latent variable `Z1` (p1 by p1).
`Sigma2`	True correlation matrix of latent variable `Z2` (p2 by p2).
`maxcancor`	True canonical correlation between `Z1` and `Z2`.
`copula1`	Copula type for the first dataset: U1 = f(Z1), where f is selected by "exp" or "cube". The default "no" leaves Z1 untransformed.
`copula2`	Copula type for the second dataset: U2 = f(Z2), where f is selected by "exp" or "cube". The default "no" leaves Z2 untransformed.
`type1`	Type of the first dataset `X1`. Could be "continuous", "trunc" or "binary".
`type2`	Type of the second dataset `X2`. Could be "continuous", "trunc" or "binary".
`muZ`	Mean vector of the latent multivariate normal (length p1 + p2 in the example below); the default is NULL.
`c1`	Constant threshold for `X1`, needed for the "trunc" and "binary" data types; the default is NULL.
`c2`	Constant threshold for `X2`, needed for the "trunc" and "binary" data types; the default is NULL.
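
The normalization mentioned for `trueidx1` and `trueidx2` can be illustrated in base R. This is a minimal sketch, not the package's own code: `normalize_direction` is a hypothetical helper, and the correlation matrix is built by hand so the snippet runs without mixedCCA installed.

```r
# Hypothetical helper (not part of mixedCCA): rescale w so that
# t(w) %*% Sigma %*% w equals 1.
normalize_direction <- function(w, Sigma) {
  w / sqrt(drop(crossprod(w, Sigma %*% w)))
}

p1 <- 5
# Autocorrelation-type matrix with entries 0.7^|i - j|, built by hand
Sigma1 <- 0.7^abs(outer(1:p1, 1:p1, "-"))

# 0/1 indicator of the active variables, as in the Examples section
trueidx1 <- c(1, 1, 1, 0, 0)
w1 <- normalize_direction(trueidx1, Sigma1)

# Quadratic form of the normalized direction: 1 up to floating-point error
drop(crossprod(w1, Sigma1 %*% w1))
```

Rescaling changes only the length of the direction, so the zero entries of the indicator vector stay zero and the sparsity pattern is preserved.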

Value

GenerateData returns a list containing:

Z1: latent numeric data matrix (n by p1).
Z2: latent numeric data matrix (n by p2).
X1: observed numeric data matrix (n by p1).
X2: observed numeric data matrix (n by p2).
true_w1: normalized true canonical direction of length p1 for X1.
true_w2: normalized true canonical direction of length p2 for X2.
type: a vector containing the types of the two datasets.
maxcancor: true canonical correlation between Z1 and Z2.
c1: constant threshold for X1 for "trunc" and "binary" data type.
c2: constant threshold for X2 for "trunc" and "binary" data type.
Sigma: true latent correlation matrix of Z1 and Z2 ((p1+p2) by (p1+p2)).
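
One way to picture how `Sigma` relates to `Sigma1`, `Sigma2`, the normalized directions and `maxcancor` is the rank-one construction sketched below in base R. The cross-block formula `Sigma12 = rho * Sigma1 w1 w2^T Sigma2` is an assumption made for illustration; it yields a joint matrix whose leading canonical correlation is `rho`, but it is not copied from the package source. The names `unit_norm` and `rho` are hypothetical.

```r
set.seed(1)
p1 <- 4; p2 <- 3
rho <- 0.9  # plays the role of maxcancor

# Hand-built within-set correlation matrices
Sigma1 <- 0.7^abs(outer(1:p1, 1:p1, "-"))
Sigma2 <- 0.5^abs(outer(1:p2, 1:p2, "-"))

# Hypothetical helper: scale w so that t(w) %*% S %*% w = 1
unit_norm <- function(w, S) w / sqrt(drop(crossprod(w, S %*% w)))
w1 <- unit_norm(c(1, 1, 0, 0), Sigma1)
w2 <- unit_norm(c(1, 0, 0), Sigma2)

# Assumed rank-one cross-correlation block and the joint latent matrix
Sigma12 <- rho * Sigma1 %*% w1 %*% t(w2) %*% Sigma2
Sigma <- rbind(cbind(Sigma1, Sigma12), cbind(t(Sigma12), Sigma2))

# Population canonical correlation: square root of the largest eigenvalue
# of Sigma1^{-1} Sigma12 Sigma2^{-1} Sigma21
M <- solve(Sigma1, Sigma12) %*% solve(Sigma2, t(Sigma12))
pop_cancor <- sqrt(max(Re(eigen(M)$values)))

# Draw latent data with this correlation structure via a Cholesky factor
n <- 500
Z <- matrix(rnorm(n * (p1 + p2)), nrow = n) %*% chol(Sigma)
Z1 <- Z[, 1:p1]
Z2 <- Z[, p1 + 1:p2]
sample_cor <- cor(drop(Z1 %*% w1), drop(Z2 %*% w2))
```

Under this construction `pop_cancor` recovers `rho`, and `sample_cor` should land near it for moderate `n`.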

Examples

### Simple example

# Load the package; autocor() and blockcor() below are mixedCCA helpers
library(mixedCCA)

# Data setting
n <- 100; p1 <- 15; p2 <- 10 # sample size and dimensions for two datasets.
maxcancor <- 0.9 # true canonical correlation

# Correlation structure within each data set
set.seed(0)
perm1 <- sample(1:p1, size = p1)
Sigma1 <- autocor(p1, 0.7)[perm1, perm1]
blockind <- sample(1:3, size = p2, replace = TRUE)
Sigma2 <- blockcor(blockind, 0.7)
mu <- rbinom(p1+p2, 1, 0.5)

# true variable indices for each dataset
trueidx1 <- c(rep(1, 3), rep(0, p1-3))
trueidx2 <- c(rep(1, 2), rep(0, p2-2))

# Data generation
simdata <- GenerateData(n=n, trueidx1 = trueidx1, trueidx2 = trueidx2, maxcancor = maxcancor,
                        Sigma1 = Sigma1, Sigma2 = Sigma2,
                        copula1 = "exp", copula2 = "cube",
                        muZ = mu,
                        type1 = "trunc", type2 = "trunc",
                        c1 = rep(1, p1), c2 =  rep(0, p2)
)
X1 <- simdata$X1
X2 <- simdata$X2

# Check the range of truncation levels (proportion of zeros) across variables
range(colMeans(X1 == 0))
range(colMeans(X2 == 0))
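
To see why the example pairs the "exp" copula with threshold 1 and the "cube" copula with threshold 0, the following base R sketch applies both transforms to a single latent variable. The thresholding rules used here (values at or below the threshold become 0 for "trunc", and an indicator of exceeding the threshold for "binary") are assumptions made for illustration, not taken from the package source.

```r
set.seed(2)
n <- 1000
z <- rnorm(n)      # one latent standard normal variable (mean 0)

# Monotone transforms corresponding to the "exp" and "cube" copula options
u_exp  <- exp(z)
u_cube <- z^3

# Assumed thresholding rules (illustrative only)
c_exp  <- 1        # exp(z) > 1 exactly when z > 0
c_cube <- 0        # z^3 > 0 exactly when z > 0
x_trunc  <- ifelse(u_exp > c_exp, u_exp, 0)   # truncated type
x_binary <- as.numeric(u_cube > c_cube)       # binary type

# Proportion of zeros, analogous to colMeans(X1 == 0) above
zero_trunc  <- mean(x_trunc == 0)
zero_binary <- mean(x_binary == 0)

# Monotone transforms leave ranks unchanged, which is what the
# Gaussian copula model relies on
rank_cor <- cor(z, u_exp, method = "spearman")
```

With a zero-mean latent variable both thresholds cut at the median, so roughly half of the observations are zero. In the example above `muZ` shifts the latent means, so the truncation levels reported by `range()` vary across variables.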

irinagain/mixedCCA documentation built on Sept. 11, 2022, 2:10 p.m.
