CSSCASimulation: Simulate the data according to the CSSCA model
In syuanuvt/CSSCA: ClusterSSCA: Cluster-wise Simultaneous Component Analysis


CSSCASimulation

R Documentation

Simulate the data according to the CSSCA model

Description

Simulate the data according to the CSSCA model

Usage

CSSCASimulation(ncluster, memcluster, nblock, ncom, ndistinct, nvar,
  psparse = 0, pnoise = 0, pcombase = 0, pfixzero = 0, meancov,
  pmean)

Arguments

`ncluster`	An integer indicating the number of clusters to simulate.
`memcluster`	A vector of length ncluster whose nth element gives the number of entries in the nth cluster. It may also be a single integer, in which case all clusters are assumed to have the same number of entries.
`ncom`	An integer indicating the number of common components.
`ndistinct`	A vector of length nblock whose ith element gives the number of distinctive components assumed for the ith data block. It may also be a single integer, in which case all blocks are assumed to have the same number of distinctive components.
`nvar`	A vector of length nblock whose ith element gives the number of variables assumed for the ith data block. It may also be a single integer, in which case all blocks are assumed to have the same number of variables.
`psparse`	A number in the range [0,1] indicating the sparsity level (i.e. the proportion of zero elements in the loading matrix).
`pcombase`	A number in the range [0,1] indicating the proportion of the "common" (i.e. identical) part in the loading matrices of the clusters. The cluster-specific part is then (1 - pcombase). It is one of the parameters that control the similarity between loading matrices.
`pfixzero`	A number in the range [0,1] indicating the proportion of zero loadings that share the same positions across all clusters. It is one of the parameters that control the similarity between loading matrices.
`meancov`	Possible values: "mean" = only includes the mean structure, "cov" = only includes the covariance structure, and "both" = includes both the mean structure and the covariance structure.
`nblock`	A positive integer indicating the number of blocks (i.e. the number of data sources).
`pnoise`	A number in the range [0,1] indicating the proportion of noise structure to add to the final data.
`pmean`	A number in the range [0,1] indicating the proportion of mean structure.
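
The conventions described above (scalar arguments standing in for equal-valued vectors, and psparse as a proportion of zero loadings) can be illustrated with a minimal sketch. It does not reproduce the package's internal code: `expand_arg` is a hypothetical helper, and the loading-matrix layout (all variables by common plus distinctive components) is an assumption made only for illustration.

```r
# Hypothetical helper, not part of the CSSCA package: recycle a scalar
# argument to the documented vector length.
expand_arg <- function(x, n) {
  if (length(x) == 1) return(rep(x, n))
  if (length(x) != n) stop("expected length 1 or ", n)
  x
}

ncluster <- 3
nblock <- 2
memcluster <- expand_arg(50, ncluster)       # 50 50 50
ndistinct  <- expand_arg(1, nblock)          # 1 1
nvar       <- expand_arg(c(15, 9), nblock)   # already a vector, unchanged

# Assumption: psparse is the share of exactly-zero cells in a loading matrix.
set.seed(1)
psparse <- 0.5
ncom <- 2
n_row <- sum(nvar)                # 24 variables in total
n_col <- ncom + sum(ndistinct)    # 4 components (assumed layout)
loadings <- matrix(rnorm(n_row * n_col), nrow = n_row, ncol = n_col)

# Set a psparse share of randomly chosen cells to zero.
n_zero <- round(psparse * length(loadings))
loadings[sample(length(loadings), n_zero)] <- 0
mean(loadings == 0)   # 0.5
```

How the package itself distributes zeros across clusters also depends on pcombase and pfixzero, which this sketch leaves out.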

Value

A list of six elements. The first element is a list containing the generated final data per block; the second element is the concatenated version of the final data (the block-wise data combined into one single dataset); the third element is the data in which clusters differ only in covariance structure (i.e. before the mean structure and noise structure are added); the fourth element is a list of cluster-specific score matrices; the fifth element is a list of cluster-specific loading matrices; the last element is a vector of cluster assignments (the nth element of the vector gives the cluster assignment of the nth observation).
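
To make the relation between the per-block data, the concatenated data, and the cluster assignment vector concrete, here is a small sketch built from random numbers rather than from actual CSSCASimulation output. The object names are hypothetical, and the ordering of observations cluster by cluster is an assumption, not something the documentation states.

```r
# Toy stand-ins for simulated block data (random numbers, not CSSCA output).
set.seed(123)
memcluster <- c(50, 50, 50)   # entries per cluster
nvar <- c(15, 9)              # variables per block
n_obs <- sum(memcluster)

# One matrix per block: the same observations, different variables.
block_data <- lapply(nvar, function(p) {
  matrix(rnorm(n_obs * p), nrow = n_obs, ncol = p)
})

# Concatenated version: blocks bound side by side into a single dataset.
concat_data <- do.call(cbind, block_data)

# Assumption: observations are ordered cluster by cluster.
cluster_assign <- rep(seq_along(memcluster), times = memcluster)

dim(concat_data)        # 150 24
table(cluster_assign)   # 50 observations in each of the 3 clusters
```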

Examples

   n_cluster <- 3
   mem_cluster <- c(50, 50, 50) # 50 entries in each cluster
   n_block <- 2
   n_com <- 2
   n_distinct <- c(1, 1) # 1 distinctive component in each block
   n_var <- c(15, 9)
   p_sparse <- 0.5
   p_noise <- 0.3
   p_combase <- 0.5 # moderate similarity
   p_fixzero <- 0.5 # moderate similarity
   mean_v <- 0.1 # covariance structure dominates
   ## Not run:
   sim <- CSSCASimulation(n_cluster, mem_cluster, n_block, n_com, n_distinct, n_var,
                          p_sparse, p_noise, p_combase, p_fixzero, "both", mean_v)
   ## End(Not run)

syuanuvt/CSSCA documentation built on Nov. 28, 2022, 7:58 p.m.