This function generates K multivariate normal data sets, where each
class is generated with a constant mean vector and a covariance matrix
consisting of block-diagonal autocorrelation matrices. The data are returned
as a single matrix x along with a vector of class labels y that
indicates class membership.
generate_blockdiag(n, mu, num_blocks, block_size, rho, sigma2 = rep(1, K))

n 
vector of the sample sizes of each class. The length of 
mu 
matrix containing the mean vectors for each class. Expected to have

num_blocks 
the number of block matrices. See details. 
block_size 
the dimensions of the square block matrix. See details. 
rho 
vector of the values of the autocorrelation parameter for each
class covariance matrix. Must equal the length of 
sigma2 
vector of the variance coefficients for each class covariance
matrix. Must equal the length of 
For simplicity, we assume that each class mean vector is constant across features. That is, the mean vector of the kth class is c_k * j_p, where j_p is a p \times 1 vector of ones and c_k is a real scalar.
The kth class covariance matrix is defined as
Σ_k = Σ^{(ρ)} \oplus Σ^{(ρ)} \oplus … \oplus Σ^{(ρ)},
where \oplus denotes the direct sum and the (i,j)th entry of Σ^{(ρ)} is
Σ_{ij}^{(ρ)} = ρ^{|i - j|}.
The matrix Σ^{(ρ)} is referred to as a block. Its dimensions
are provided in the block_size argument, and the number of blocks is
specified in the num_blocks argument.
Each matrix Σ_k is generated by the cov_block_autocorrelation function.
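As a rough illustration of the block structure described above, the following base-R sketch builds one class covariance matrix. The helper name block_autocorr is hypothetical and is not the package's cov_block_autocorrelation function; multiplying the result by sigma2 is also an assumption about how the variance coefficient enters.

```r
# Hypothetical helper for illustration only; not the package's
# cov_block_autocorrelation function.
block_autocorr <- function(num_blocks, block_size, rho, sigma2 = 1) {
  idx <- seq_len(block_size)
  # One block: the (i, j)th entry is rho^|i - j|.
  block <- rho^abs(outer(idx, idx, "-"))
  # Direct sum of num_blocks identical blocks, written as a Kronecker
  # product with an identity matrix. Scaling by sigma2 is an assumption.
  sigma2 * kronecker(diag(num_blocks), block)
}

Sigma <- block_autocorr(num_blocks = 3, block_size = 3, rho = 0.5)
dim(Sigma)        # p = block_size * num_blocks rows and columns
Sigma[1:3, 1:3]   # first diagonal block
Sigma[1:3, 4:6]   # off-diagonal block, all zeros
```

The Kronecker product is used here only because it keeps the sketch in base R; any block-diagonal constructor would give the same matrix.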
The number of classes K is determined with lazy evaluation as the
length of n.
The number of features p is computed as block_size * num_blocks.
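The lazy-evaluation remark can be illustrated with a small stand-alone function. R evaluates a default argument only when it is first used, and does so inside the function's own environment, so a default such as rep(1, K) can refer to a K that is assigned in the function body. The name dims_sketch is hypothetical and only mirrors the argument names of generate_blockdiag.

```r
# Hypothetical illustration of how K and p can be derived; this is not
# the package's code.
dims_sketch <- function(n, num_blocks, block_size, sigma2 = rep(1, K)) {
  K <- length(n)                # number of classes
  p <- block_size * num_blocks  # number of features
  # sigma2 is first used here, after K exists, so the default resolves.
  list(K = K, p = p, sigma2 = sigma2)
}

out <- dims_sketch(n = c(15, 15, 15), num_blocks = 3, block_size = 3)
out$K
out$p
out$sigma2
```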
named list with elements:
x: matrix of observations with n rows and p columns
y: vector of class labels that indicates class membership for each observation (row) in x
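To make the shape of the returned list concrete, here is a minimal base-R sketch of a generator with the same arguments. It is not the package's implementation: the name sketch_blockdiag is hypothetical, sampling through a Cholesky factor is a swapped-in technique, mu is assumed to have p rows and K columns, sigma2 is assumed to scale each class covariance matrix, and y is assumed to be an integer vector.

```r
# Hypothetical base-R sketch; not the implementation of generate_blockdiag.
sketch_blockdiag <- function(n, mu, num_blocks, block_size, rho,
                             sigma2 = rep(1, K)) {
  K <- length(n)
  p <- block_size * num_blocks
  # Assumed layout: one column of mu per class, one row per feature.
  stopifnot(nrow(mu) == p, ncol(mu) == K,
            length(rho) == K, length(sigma2) == K)
  idx <- seq_len(block_size)
  x <- lapply(seq_len(K), function(k) {
    block <- rho[k]^abs(outer(idx, idx, "-"))
    Sigma_k <- sigma2[k] * kronecker(diag(num_blocks), block)
    z <- matrix(rnorm(n[k] * p), nrow = n[k], ncol = p)
    # chol() returns an upper-triangular R with t(R) %*% R = Sigma_k,
    # so the rows of z %*% R have covariance Sigma_k.
    sweep(z %*% chol(Sigma_k), 2, mu[, k], "+")
  })
  list(x = do.call(rbind, x), y = rep(seq_len(K), times = n))
}

set.seed(42)
means <- matrix(rep(1:3, each = 9), ncol = 3)
sim <- sketch_blockdiag(n = c(15, 15, 15), mu = means, num_blocks = 3,
                        block_size = 3, rho = seq(.1, .9, length.out = 3))
dim(sim$x)    # one row per observation, p columns
table(sim$y)  # observations per class
```

In this sketch the rows of x are stacked class by class, so x has as many rows as the sample sizes in n add up to, and y labels each row with its class index.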
# Generates data from K = 3 classes.
means <- matrix(rep(1:3, each=9), ncol=3)
data <- generate_blockdiag(n = c(15, 15, 15), block_size = 3, num_blocks = 3,
                           rho = seq(.1, .9, length = 3), mu = means)
data$x
data$y
# Generates data from K = 4 classes with unequal sample sizes.
# sigma2 is not passed, so it keeps its default of 1 for every class.
means <- matrix(rep(1:4, each=9), ncol=4)
data <- generate_blockdiag(n = c(15, 15, 15, 20), block_size = 3, num_blocks = 3,
                           rho = seq(.1, .9, length = 4), mu = means)
data$x
data$y
