We generate n_k observations (k = 1, …, K) from each of K multivariate normal distributions. Let the kth population have a p-dimensional multivariate normal distribution, N_p(μ_k, Σ_k) with mean vector μ_k and positive-definite covariance matrix Σ_k. Each covariance matrix Σ_k consists of block-diagonal autocorrelation matrices.
simdata_guo(n, mean, block_size, num_blocks, rho,
  sigma2 = 1, seed = NULL)
n: a vector (of length K) of the sample sizes for each population
mean: a vector or a list (of length K) of mean vectors
block_size: a vector (of length K) of the sizes of the square block matrices for each population. See details.
num_blocks: a vector (of length K) giving the number of block matrices for each population. See details.
rho: a vector (of length K) of the values of the autocorrelation parameter for each class covariance matrix
sigma2: a vector (of length K) of the variance coefficients for each class covariance matrix
seed: seed for random number generation (If
The kth class covariance matrix is defined as
Σ_k = Σ^{(ρ)} ⊕ Σ^{(-ρ)} ⊕ … ⊕ Σ^{(ρ)},
where ⊕ denotes the direct sum and the (i,j)th entry of Σ^{(ρ)} is
Σ_{ij}^{(ρ)} = ρ^{|i - j|}.
The matrix Σ^{(ρ)} is referred to as a block. Its dimensions are given by the block_size argument, and the number of blocks is given by the num_blocks argument.
Each matrix Σ_k is generated by the cov_block_autocorrelation function.
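As an illustration of the formula above, here is a minimal base-R sketch that builds one such matrix. The helper names autocorr_block and block_autocorr_cov are hypothetical, and this is not the code of cov_block_autocorrelation. The sketch assumes that the sign of ρ alternates from block to block exactly as the formula is written and that sigma2 multiplies the whole matrix; the package function may differ on either point.

```r
# Hypothetical helper: one autocorrelation block with entries rho^|i - j|.
autocorr_block <- function(block_size, rho) {
  idx <- seq_len(block_size)
  rho^abs(outer(idx, idx, "-"))
}

# Hypothetical helper: direct sum of num_blocks blocks.
# Assumption: odd-numbered blocks use rho and even-numbered blocks use -rho,
# following the formula as written; sigma2 is assumed to scale the result.
block_autocorr_cov <- function(block_size, num_blocks, rho, sigma2 = 1) {
  p <- block_size * num_blocks
  Sigma <- matrix(0, nrow = p, ncol = p)
  for (b in seq_len(num_blocks)) {
    rows <- (b - 1) * block_size + seq_len(block_size)
    sign_b <- if (b %% 2 == 1) 1 else -1
    Sigma[rows, rows] <- autocorr_block(block_size, sign_b * rho)
  }
  sigma2 * Sigma
}

Sigma <- block_autocorr_cov(block_size = 3, num_blocks = 2, rho = 0.5)
dim(Sigma)
```

Entries outside the diagonal blocks stay at zero, which is what the direct sum means in matrix terms.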
The number of populations, K, is determined from the length of the vector of sample sizes, n. The mean vectors can be given in a list of length K.
If one mean is given (as a vector or a list having 1
element), then all populations share this common mean.
The block sizes can be given as a numeric vector or a single value, in which case the value is replicated K times. The same logic applies to num_blocks, rho, and sigma2.
For each class, the number of features, p, is computed as block_size * num_blocks. The value of p must be the same for every class.
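The recycling and dimension rules described above can be sketched as follows. The helper recycle_to_K is hypothetical and only mirrors the behaviour described in the text; it is not taken from the package, and the error messages are illustrative.

```r
# Hypothetical helper: replicate a single value K times, or require length K.
recycle_to_K <- function(x, K, name) {
  if (length(x) == 1) {
    rep(x, K)
  } else if (length(x) == K) {
    x
  } else {
    stop(sprintf("'%s' must have length 1 or %d", name, K))
  }
}

n <- c(15, 15, 15)
K <- length(n)  # number of populations comes from the sample sizes

block_size <- recycle_to_K(c(2, 4, 8), K, "block_size")
num_blocks <- recycle_to_K(16 / c(2, 4, 8), K, "num_blocks")
rho <- recycle_to_K(0.9, K, "rho")  # single value, replicated K times

# The feature dimension must be identical across classes.
p <- unique(block_size * num_blocks)
if (length(p) != 1) {
  stop("block_size * num_blocks must be the same for every class")
}
p
```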
The block-diagonal covariance matrix with autocorrelated blocks was popularized by Guo et al. (2007) for studying classification of high-dimensional data.
A named list containing:
x: A matrix whose rows are the observations generated and whose columns are the p features (variables).
y: A vector denoting the population from which the observation in each row was generated.
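To make the shape of the returned value concrete, the sketch below draws from two multivariate normal populations and stacks the results into a matrix and a label vector. It is an assumption-laden illustration, not the package's implementation: draw_mvn is a hypothetical helper that samples via a Cholesky factor, the covariance matrices are simple placeholders rather than block autocorrelation matrices, and the labels are assumed to be plain integers.

```r
set.seed(1)  # only to make this sketch reproducible

n <- c(4, 6)                          # sample sizes for K = 2 populations
p <- 3                                # number of features
means <- list(rep(0, p), rep(5, p))   # one mean vector per population
covs <- list(diag(p), 2 * diag(p))    # placeholder covariance matrices

# Hypothetical helper: n_k draws from N_p(mu, Sigma) using a Cholesky factor.
draw_mvn <- function(n_k, mu, Sigma) {
  R <- chol(Sigma)                    # upper triangular, t(R) %*% R equals Sigma
  Z <- matrix(rnorm(n_k * length(mu)), nrow = n_k)
  sweep(Z %*% R, 2, mu, "+")          # add the mean to every row
}

x <- do.call(rbind, Map(draw_mvn, n, means, covs))
y <- rep(seq_along(n), times = n)     # population label for each row
sim <- list(x = x, y = y)

dim(sim$x)
table(sim$y)
```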
Guo, Y., Hastie, T., & Tibshirani, R. (2007). "Regularized linear discriminant analysis and its application in microarrays," Biostatistics, 8, 1, 86-100.
# Generates 10 observations from each of two multivariate normal populations
# having a block-diagonal autocorrelation structure.
block_size <- 3
num_blocks <- 3
p <- block_size * num_blocks
means_list <- list(seq_len(p), -seq_len(p))
data <- simdata_guo(n = c(10, 10), mean = means_list, block_size = block_size,
                    num_blocks = num_blocks, rho = 0.9, seed = 42)
dim(data$x)
table(data$y)

# Generates 15 observations from each of three multivariate normal
# populations having block-diagonal autocorrelation structures. The
# covariance matrices are unequal.
p <- 16
block_size <- c(2, 4, 8)
num_blocks <- p / block_size
rho <- c(0.1, 0.5, 0.9)
sigma2 <- 1:3
mean_list <- list(rep.int(-5, p), rep.int(0, p), rep.int(5, p))
set.seed(42)
data2 <- simdata_guo(n = c(15, 15, 15), mean = mean_list,
                     block_size = block_size, num_blocks = num_blocks,
                     rho = rho, sigma2 = sigma2)
dim(data2$x)
table(data2$y)