sim.student: Simulate data for benchmarking Student-T regression models.

View source: R/simfunc.R

Description

Simulate data for benchmarking Student-T regression models.

Usage

sim.student(
  n = 100,
  p = 25,
  rho = 0.5,
  coefs = c(1.5, 4, -2, -4, 1, 2, -2.5),
  snr = 2,
  noise.df = 3,
  scale = TRUE,
  cormat = NULL,
  seed = 100
)

Arguments

n

Number of observations.

p

Number of predictors.

rho

Correlation for generating correlated variables.

coefs

Vector of non-zero coefficients.

snr

Signal-to-noise ratio (SNR). Defaults to 2. SNR is defined as

\frac{Var(E(y | X))}{Var(y - E(y | X))} = \frac{Var(f(X))}{Var(\varepsilon)} = \frac{Var(X^T β)}{Var(\varepsilon)} = \frac{β^T Σ β}{σ^2}.

noise.df

The degrees of freedom for the Student-t noise distribution. Defaults to 3 in the usage above; Inf gives Gaussian noise.

scale

Should the data be scaled? Defaults to TRUE.

seed

Random seed for reproducibility.
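
The rho, snr and noise.df arguments interact through the SNR definition: once the signal variance is fixed, the SNR determines the noise variance. The following is a minimal base-R sketch of that idea, not the cvreg source. The correlation structure rho^|i - j|, the Cholesky-based sampling, and all variable names are assumptions made for illustration, and the rescaling only applies for finite noise.df greater than 2, where the t distribution has variance df / (df - 2).

```r
# Hypothetical sketch (not the cvreg implementation): Student-t noise at a target SNR.
set.seed(100)
n <- 5000; p <- 5; rho <- 0.5
beta <- c(1.5, 4, -2, 0, 0)   # non-zero coefficients padded with zeros
snr <- 2; noise.df <- 3

# Assumed correlation structure: Sigma[i, j] = rho^|i - j|
Sigma <- rho^abs(outer(seq_len(p), seq_len(p), "-"))

# Correlated Gaussian predictors via the upper Cholesky factor of Sigma
X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)

# Signal variance beta' Sigma beta, and the noise variance implied by the SNR
signal.var <- drop(t(beta) %*% Sigma %*% beta)
sigma2 <- signal.var / snr

# A t variate with df > 2 has variance df / (df - 2); rescale it to variance sigma2
eps <- rt(n, df = noise.df) * sqrt(sigma2 * (noise.df - 2) / noise.df)

y <- drop(X %*% beta) + eps
```

With heavy-tailed noise such as noise.df = 3, the empirical variance of eps converges slowly, so a sample SNR computed from one draw can differ noticeably from the target.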

Value

A data frame with an attribute "true.betas" that contains the true coefficients. If scale = TRUE, the coefficients are rescaled to match the scaled data.
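
One way to read "scaled to match" is that each coefficient is multiplied by the standard deviation of its predictor, so the linear predictor is unchanged apart from a constant shift. The sketch below illustrates that convention in base R and shows how an attribute such as "true.betas" is stored and retrieved. Whether sim.student uses exactly this convention is an assumption, and the variable names are hypothetical.

```r
# Hypothetical sketch: rescaling coefficients when predictors are standardized.
set.seed(1)
X <- sapply(c(1, 2, 5), function(s) rnorm(200, sd = s))  # columns on different scales
beta <- c(1.5, 4, -2)

Xs <- scale(X)                     # center and scale each column
sds <- attr(Xs, "scaled:scale")    # standard deviations used by scale()
beta.scaled <- beta * sds          # coefficients on the standardized scale

# Store the rescaled coefficients as an attribute of the data frame
dat <- data.frame(Xs)
attr(dat, "true.betas") <- beta.scaled

# The two linear predictors differ only by a constant absorbed by the intercept
lp.raw <- drop(X %*% beta)
lp.scaled <- drop(Xs %*% attr(dat, "true.betas"))
shift <- sum(attr(Xs, "scaled:center") * beta)
stopifnot(isTRUE(all.equal(lp.raw, lp.scaled + shift)))
```

On a data frame returned by sim.student, the same accessor form, attr(dat, "true.betas"), retrieves the stored coefficients.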

Author(s)

Brandon Vaughan

Examples

dat <- sim.student(
  n = 120, p = 200, rho = 0.26,
  coefs = c(runif(25, -4, -1), runif(25, 1, 4)), snr = 2,
  seed = 100)

abnormally-distributed/cvreg documentation built on May 3, 2020, 3:45 p.m.