simulate_clustered_data: Simulate clustered data using the normal distribution.

View source: R/simulate_clustered_data.R

Description

This function generates a matrix with n independent rows and N columns, where the columns are clustered into K classes with N[k] instances in class k.

Usage

simulate_clustered_data(
  n = 100,
  Nk = c(40, 200),
  s = c(1, 1),
  rho = matrix(c(0.6, 0.1, 0.1, 0.25), nrow = 2, ncol = 2),
  tau = 1,
  method = c("by-class", "by-instance")
)

Arguments

n

The total number of observations per instance.

Nk

A vector of length K giving the number of instances within each class.

s

A vector of standard deviations. See details.

rho

A matrix of correlation coefficients. See details.

tau

The within-group variance. Only used when method = "by_class".

method

Either "by_instance" or "by_class".

Details

This function generates a matrix with a block-correlation structure across columns and independent rows.

When method = 'by_instance', the values of s and rho are taken to be instance-level properties of the data. That is, s is a vector of length K such that the kth entry is the standard deviation of observations within class k, and rho is a K x K symmetric matrix such that entry (i,j) gives the correlation between an instance in class i and an instance in class j. Correspondingly, entry (i,i) gives the correlation between two (different) instances in class i.

In contrast, when method = 'by_class', the values of s and rho are taken to be class-level properties. The variance from the 'by_instance' characterization is broken down into a class-level variance (s^2), which gives the variability of the "true" pattern of the class over the observations, and an instance-level variance (tau^2), which gives the variability of the observed instances from the true pattern. The correlation is now in terms of the classes: entry (i,j) gives the correlation between the "true" pattern of class i and the "true" pattern of class j.
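
As a rough illustration of the 'by_instance' characterization described above, the following base R sketch builds the implied column covariance matrix and draws independent rows from it. It is not the package's implementation: the variable names (cls, R, Sigma, Z, X), the small class sizes, and the Cholesky-based sampling are assumptions made for this example only.

```r
# Illustrative sketch only; not the classCleaner implementation.
set.seed(1)

n   <- 100                       # number of independent rows
Nk  <- c(3, 4)                   # instances per class (kept small for display)
s   <- c(1, 1)                   # instance-level standard deviations, one per class
rho <- matrix(c(0.6, 0.1, 0.1, 0.25), nrow = 2, ncol = 2)

# Class label of each column: 1 1 1 2 2 2 2
cls <- rep(seq_along(Nk), times = Nk)

# Correlation between columns a and b is rho[class(a), class(b)];
# each column has correlation 1 with itself.
R <- rho[cls, cls]
diag(R) <- 1

# Scale correlations by the standard deviations to get the covariance.
Sigma <- R * outer(s[cls], s[cls])

# Draw n independent rows from N(0, Sigma): chol() returns U with
# t(U) %*% U == Sigma, so the rows of Z %*% U have covariance Sigma.
Z <- matrix(rnorm(n * sum(Nk)), nrow = n)
X <- Z %*% chol(Sigma)

dim(X)
```

Here the diagonal blocks of Sigma hold the within-class correlations (0.6 and 0.25) and the off-diagonal blocks hold the between-class correlation (0.1), which is the block structure the text describes.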

Examples

rho.mat <- matrix(c(0.5, 0.2, 0.2, 0.3), nrow = 2, ncol = 2)
X <- simulate_by_instance(Nk = c(50, 100), rho = rho.mat, n = 150)

melissakey/classCleaner documentation built on Feb. 11, 2022, 3:33 a.m.