ModelSelectionCSSCA: Conduct model selection via the CHull method in the given...

View source: R/ModelSelectionCSSCA.R

ModelSelectionCSSCA    R Documentation

Description

Conducts model selection via the CHull (Convex Hull) method over the given ranges of the number of clusters and the level of sparsity. This function depends on the Rdata file created by the function "Computation" (another function in this package). The current version uses a relatively naive approach to model selection; an updated version that selects the two parameters iteratively is still under development. To guarantee that the model selection algorithm (Convex Hull method) works properly, the selection range must contain at least 4 elements.
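As a rough illustration of what a convex-hull-based selection does, the sketch below implements a generic CHull-style scree test in base R. It is not this package's implementation: the helper name `chull_select`, the assumption that a higher fit value is better, and the toy numbers are all hypothetical.

```r
# Hypothetical helper: a generic CHull-style selection, not the package's code.
# complexity: one complexity value per candidate model
# fit:        one goodness-of-fit value per candidate model (assumed higher = better)
chull_select <- function(complexity, fit) {
  stopifnot(length(complexity) == length(fit), length(fit) >= 4)
  ord <- order(complexity, -fit)
  ft <- fit[ord]
  # keep only models that fit strictly better than every simpler model
  keep <- c(TRUE, ft[-1] > cummax(ft)[-length(ft)])
  idx <- ord[keep]
  # drop points below the upper hull until the slopes strictly decrease
  repeat {
    n <- length(idx)
    if (n < 3) break
    slopes <- diff(fit[idx]) / diff(complexity[idx])
    bad <- which(slopes[-1] >= slopes[-(n - 1)]) + 1
    if (length(bad) == 0) break
    idx <- idx[-bad[1]]
  }
  n <- length(idx)
  if (n < 3) return(list(hull = idx, st = numeric(0), selected = NA_integer_))
  # scree-test ratio: slope before a hull point divided by slope after it
  slopes <- diff(fit[idx]) / diff(complexity[idx])
  st <- slopes[-(n - 1)] / slopes[-1]
  list(hull = idx, st = st, selected = idx[which.max(st) + 1])
}

# toy values, made up for illustration
res <- chull_select(complexity = 1:5, fit = c(0.40, 0.70, 0.85, 0.87, 0.88))
res$selected  # index of the model at the sharpest "elbow" of the hull
```

The first and last hull points have no slope on one side, so they can never be selected; this is one reason a selection range needs several candidates before the method becomes informative.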

Usage

ModelSelectionCSSCA(ncluster_range, psparse_range, n_observation, n_com,
  n_distinct, n_var, complex_type = "normal")

Arguments

ncluster_range

A vector indicating the range of the number of clusters to select from. All elements must be positive integers. Repeated elements are not allowed.

psparse_range

A vector indicating the range of the sparsity level to select from. All elements must lie within [0,1]. Repeated elements are not allowed.

n_observation

the number of entries that are included in the dataset

n_com

A positive integer indicating the number of common components

n_distinct

A vector of length nblock, whose ith element indicates the number of distinctive components assumed for the ith data block. It can also be a single integer, in which case all blocks are assumed to have the same number of distinctive components.

n_var

A vector of length nblock, whose ith element indicates the number of variables assumed for the ith data block. It can also be a single integer, in which case all blocks are assumed to have the same number of variables.

complex_type

Categorical options: "normal" = takes the level of sparsity into account when calculating the complexity, "nonsparse" = does not take the level of sparsity into account, "lowtriangle" = only accounts for the lower-triangle matrix (instead of the upper-triangle matrix) when calculating the complexity
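The constraints listed for these arguments can be checked before calling the function. The following is a minimal sketch in base R; `normalize_args` and its `n_block` parameter are hypothetical and not part of the package, and the package itself may validate its inputs differently.

```r
# Hypothetical helper, not part of the CSSCA package.
# n_block (number of data blocks) is an assumed extra parameter used only here.
normalize_args <- function(ncluster_range, psparse_range, n_distinct, n_var,
                           n_block,
                           complex_type = c("normal", "nonsparse", "lowtriangle")) {
  # restrict complex_type to the three documented options
  complex_type <- match.arg(complex_type)
  stopifnot(
    all(ncluster_range > 0),
    all(ncluster_range == round(ncluster_range)),   # positive integers only
    !anyDuplicated(ncluster_range),                 # no repeated elements
    all(psparse_range >= 0 & psparse_range <= 1),   # within [0, 1]
    !anyDuplicated(psparse_range)
  )
  # a single integer is expanded to one value per block
  if (length(n_distinct) == 1) n_distinct <- rep(n_distinct, n_block)
  if (length(n_var) == 1) n_var <- rep(n_var, n_block)
  stopifnot(length(n_distinct) == n_block, length(n_var) == n_block)
  list(n_distinct = n_distinct, n_var = n_var, complex_type = complex_type)
}

checked <- normalize_args(1:4, c(0, 0.1, 0.3, 0.5),
                          n_distinct = 1, n_var = c(15, 9), n_block = 2)
checked$n_distinct  # the single value 1 expanded to one entry per block
```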

Value

The selected level of sparsity and number of clusters. Plots that display the hull are also created.

Examples

# (following the example used in demonstrating the simulation and calculation functions)
n_cluster <- 3
mem_cluster <- c(50, 50, 50) # 50 entries in each cluster
n_obs <- sum(mem_cluster)
n_block <- 2
n_com <- 2
n_distinct <- c(1, 1) # 1 distinctive component in each block
n_var <- c(15, 9)
p_sparse <- 0.5
p_noise <- 0.3
p_combase <- 0.5 # moderate similarity
p_fixzero <- 0.5 # moderate similarity
mean_v <- 0.1 # covariance structure dominates
# the customized range for parameter selection
cluster_range <- 1:4
sparse_range <- c(0, 0.1, 0.3, 0.5)

# simulate and compute the data in customized settings
## Not run: 
# simulate the data with the function CSSCASimulation
# (the result is stored as sim here because sim is used in the next call)
sim <- CSSCASimulation(n_cluster, mem_cluster, n_block, n_com, n_distinct, n_var, p_sparse,
  p_noise, p_combase, p_fixzero, "both", mean_v)
# compute the CSSCA results in various settings and save the data in the local directory
ComputationCSSCA(sim$concatnated_data, cluster_range, sparse_range, n_block, n_com, n_distinct, n_var, computation = "easy")
# conduct the model selection with the saved results (the original data are not needed)
ModelSelectionCSSCA(cluster_range, sparse_range, n_obs, n_com, n_distinct, n_var)
## End(Not run)
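To see how many candidate settings the two ranges in this example span, they can be crossed with base R's `expand.grid`. This is only an illustration: whether the package evaluates every combination on this grid is an assumption, and the column names below are hypothetical.

```r
# Ranges copied from the example above
cluster_range <- 1:4
sparse_range <- c(0, 0.1, 0.3, 0.5)

# Every (number of clusters, sparsity level) pair; column names are illustrative
candidates <- expand.grid(n_cluster = cluster_range, p_sparse = sparse_range)
nrow(candidates)  # number of candidate settings spanned by the two ranges
head(candidates)
```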

syuanuvt/CSSCA documentation built on Nov. 28, 2022, 7:58 p.m.