data.simulation: Simulates subspace clustering data


View source: R/data.simulation.R

Description

Generates simulated data with a low-rank subspace structure: variables are partitioned into clusters, and each cluster has a low-rank representation. The factors that span the subspaces are not shared between clusters.
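
The structure described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the Gaussian factors and loadings, the fixed `dims`, and the names `blocks`, `coeffs` and `block.ranks` are assumptions chosen for illustration. Each cluster gets its own factors, so the columns of each block span a subspace of the stated dimension.

```r
set.seed(1)
n <- 50           # number of individuals (rows)
dims <- c(2, 3)   # assumed subspace dimension of each cluster
numb.vars <- 6    # number of variables in each cluster

# Build one noiseless block per cluster from factors private to that cluster
blocks <- lapply(dims, function(d) {
  factors <- matrix(rnorm(n * d), nrow = n)          # n x d factors
  coeffs <- matrix(rnorm(d * numb.vars), nrow = d)   # d x numb.vars loadings
  factors %*% coeffs                                 # n x numb.vars block of rank d
})

signals <- do.call(cbind, blocks)             # noiseless data, one column per variable
s <- rep(seq_along(dims), each = numb.vars)   # cluster label of each column

# Numerical rank of each block; it should match the corresponding entry of dims
block.ranks <- sapply(blocks, function(b) qr(b)$rank)
print(block.ranks)
```

Because no factor is reused across blocks, the rank of each block depends only on its own dimension, which is the property the description refers to.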

Usage

data.simulation(
  n = 100,
  SNR = 1,
  K = 10,
  numb.vars = 30,
  max.dim = 2,
  min.dim = 1,
  equal.dims = TRUE
)

Arguments

n

An integer, number of individuals.

SNR

A numeric, signal-to-noise ratio, measured as the ratio of the variance of a variable (an element of a subspace) to the variance of the noise.

K

An integer, number of subspaces.

numb.vars

An integer, number of variables in each subspace.

max.dim

An integer. If equal.dims is TRUE, max.dim is the dimension of every subspace. If equal.dims is FALSE, subspace dimensions are drawn from a uniform distribution on [min.dim, max.dim].

min.dim

An integer, minimal dimension of a subspace.

equal.dims

A boolean, if TRUE (the default), all clusters have the same dimension.
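
How equal.dims, min.dim, max.dim and SNR interact can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: it assumes the uniform draw is over the integers min.dim, ..., max.dim, and it takes SNR literally as var(signal) / var(noise). The names `noise.sd` and `empirical.snr` are hypothetical.

```r
set.seed(2)
K <- 5
min.dim <- 1
max.dim <- 3
equal.dims <- FALSE
SNR <- 2

# Subspace dimensions: constant, or uniform on the integers min.dim..max.dim.
# sample.int() avoids the sample(x) pitfall when min.dim == max.dim.
dims <- if (equal.dims) {
  rep(max.dim, K)
} else {
  min.dim + sample.int(max.dim - min.dim + 1, K, replace = TRUE) - 1
}
print(dims)

# Noise scaled so that var(signal) / var(noise) is close to SNR
signal <- as.numeric(scale(rnorm(1000)))   # one variable, standardized to unit variance
noise.sd <- sqrt(var(signal) / SNR)        # assumed reading of the SNR definition
noise <- rnorm(length(signal), sd = noise.sd)
x <- signal + noise                        # observed variable
empirical.snr <- var(signal) / var(noise)
print(empirical.snr)
```

With equal.dims set to TRUE, the sketch ignores min.dim entirely, matching the argument descriptions above.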

Value

A list consisting of:

X

matrix, generated data

signals

matrix, data without noise

dims

vector, dimensions of subspaces

factors

matrix, columns of which span subspaces

s

vector, true partition of variables
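
A quick way to look at the returned list is sketched below. It assumes the varclust package is installed and is guarded so that it does nothing otherwise. The component names are those listed above; the exact shapes of the components are not stated here, so the snippet only prints them.

```r
# Guarded so the snippet is a no-op when varclust is not installed
if (requireNamespace("varclust", quietly = TRUE)) {
  sim <- varclust::data.simulation(n = 50, K = 3, numb.vars = 10, max.dim = 2)
  str(sim, max.level = 1)   # overview of the components X, signals, dims, factors, s
  print(sim$dims)           # dimension of each simulated subspace
  print(table(sim$s))       # number of variables assigned to each cluster
}
```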

Examples

sim.data <- data.simulation()
sim.data2 <- data.simulation(
  n = 30, SNR = 2, K = 5, numb.vars = 20,
  max.dim = 3, equal.dims = FALSE
)

psobczyk/varclust documentation built on June 18, 2021, 3:02 p.m.