make_data: Make a synthetic dataset


View source: R/make_data.R

Description

Creates a random synthetic dataset for sparse multivariate regression according to the model:

Y = XB + E,

where X is the n x p design matrix, B is the p x q regressor matrix, Y is the n x q response matrix, and E is the n x q matrix error term.
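
As a minimal base-R sketch of this model (not the package's own code): with the dimensions implied by the arguments, X is n x p and B is p x q, so the product is formed as X %*% B and Y and E are both n x q. The sizes, the hand-set coefficients, and the independent standard normal errors below are illustrative assumptions only.

```r
# Illustrative sketch of Y = XB + E with small, made-up sizes.
set.seed(1)
n <- 50; p <- 4; q <- 3

X <- matrix(rnorm(n * p), nrow = n, ncol = p)   # design matrix, n x p
B <- matrix(0, nrow = p, ncol = q)              # regressor matrix, p x q
B[1, ] <- c(1.5, -2, 0.5)                       # only the first feature is active
E <- matrix(rnorm(n * q), nrow = n, ncol = q)   # error term, n x q (iid here)

Y <- X %*% B + E                                # response matrix, n x q
print(dim(Y))
```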

Usage

make_data(n, p, q, b1 = 0.2, b2 = 0.5, sigma = 1, rho_x = 0.6,
  type = "AR1", rho_err = 0.7, h = 0.9, n_edge = 1, shift = 1,
  power = 1, zero_appeal = 1, g = 4, diag_val = 1,
  edge_val = 0.3, reps = 1, seed = NULL)
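
A call might look like the following sketch. It assumes the tsmvrdata package is installed and skips the call otherwise. The names of the entries in each returned dataset are not stated on this page, so the result is inspected with str() rather than by name. The argument values are arbitrary examples.

```r
# Hedged usage sketch: runs only if the tsmvrdata package is available.
sims <- NULL
if (requireNamespace("tsmvrdata", quietly = TRUE)) {
  # Three datasets with 100 observations, 10 features, and 5 responses;
  # all other arguments keep their documented defaults.
  sims <- tsmvrdata::make_data(n = 100, p = 10, q = 5, reps = 3, seed = 1)
  print(length(sims))   # one entry per requested dataset
  str(sims[[1]])        # inspect the structure of the first dataset
}
```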

Arguments

n

number of observations (positive integer)

p

number of regressor features (positive integer)

q

number of responses (positive integer)

b1

Bernoulli parameter for controlling regressor matrix sparsity (numeric probability between 0 and 1)

b2

second Bernoulli parameter for controlling regressor matrix sparsity (numeric probability between 0 and 1)

sigma

scale of error term (positive numeric)

rho_x

autoregression parameter for the covariance matrix of the design matrix (0 < rho_x < 1)

type

type of covariance matrix (string: 'AR1', 'FGN', or 'SFN')

rho_err

autoregression parameter for AR(1) covariance matrix (0 < rho_err < 1)

h

Hurst parameter for FGN covariance matrix (0 < h < 1)

n_edge

number of edges added per step of the Barabasi algorithm for SFN covariance matrix (positive integer)

shift

eigenvalue shift parameter for SFN covariance matrix (shift > 0); ensures the matrix is PSD

power

scaling power for SFN covariance matrix (positive numeric)

zero_appeal

baseline attractiveness in the Barabasi algorithm for SFN covariance matrix (positive numeric)

g

number of hub nodes for HUB graph precision matrix (positive integer-valued numeric less than q)

diag_val

value of the diagonal entries of the HUB graph precision matrix (non-negative numeric)

edge_val

values of HUB graph network edges

reps

number of randomly drawn datasets to return (positive integer)

seed

seed for pseudo-random number generator
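
To make the roles of rho_err and h concrete, here is a base-R sketch of the textbook AR(1) and fractional Gaussian noise (FGN) covariance matrices. The helper names ar1_cov and fgn_cov are hypothetical, and whether make_data uses exactly these parameterizations is an assumption.

```r
# AR(1) covariance: entry (i, j) is rho^|i - j|.
ar1_cov <- function(q, rho) {
  rho^abs(outer(seq_len(q), seq_len(q), "-"))
}

# FGN covariance with Hurst parameter h: with k = |i - j|, entry (i, j) is
# 0.5 * (|k + 1|^(2h) - 2 * |k|^(2h) + |k - 1|^(2h)).
fgn_cov <- function(q, h) {
  k <- abs(outer(seq_len(q), seq_len(q), "-"))
  0.5 * ((k + 1)^(2 * h) - 2 * k^(2 * h) + abs(k - 1)^(2 * h))
}

S_ar1 <- ar1_cov(4, rho = 0.7)   # uses the documented default rho_err
S_fgn <- fgn_cov(4, h = 0.9)     # uses the documented default h
print(round(S_ar1, 3))
print(round(S_fgn, 3))
```

Both matrices have unit diagonal. Larger rho or h gives slower decay of the off-diagonal entries.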

Value

Returns a list of length reps. Each entry is itself a list comprising a synthetic sparse multivariate dataset: a regressor matrix B, a design matrix X, and a response matrix Y. B has an expected sparsity of b1 x b2.
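
One construction that yields the b1 x b2 figure, reading it as the expected fraction of nonzero entries, is an elementwise product of a dense Gaussian matrix, an entrywise Bernoulli(b1) mask, and a row mask drawn from Bernoulli(b2). The sketch below is an assumption about how such a matrix can be built, not a copy of the package code. The helper name sparse_B is hypothetical.

```r
# Sketch: sparse coefficient matrix as an elementwise product W * K * Q.
sparse_B <- function(p, q, b1, b2) {
  W <- matrix(rnorm(p * q), nrow = p, ncol = q)          # dense Gaussian values
  K <- matrix(rbinom(p * q, 1, b1), nrow = p, ncol = q)  # entrywise Bernoulli(b1) mask
  keep_row <- rbinom(p, 1, b2)                           # one Bernoulli(b2) draw per row
  Q <- matrix(keep_row, nrow = p, ncol = q)              # each row is all ones or all zeros
  list(B = W * K * Q, keep_row = keep_row)
}

set.seed(1)
out <- sparse_B(p = 200, q = 50, b1 = 0.2, b2 = 0.5)
B <- out$B
print(mean(B != 0))   # fraction of nonzero entries; b1 * b2 = 0.1 in expectation
```

An entry is nonzero only if its row is kept and its own mask draw is one, which is why the two probabilities multiply.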

See also regressor_matrix, covariance_matrix, and tsmvr_solve.

References

\insertRef{MRCE}{tsmvr}


spcorum/tsmvrdata documentation built on May 6, 2019, 11:17 a.m.