GenSynthetic: Generate Synthetic Data

View source: R/gensynthetic.R

GenSynthetic    R Documentation

Generate Synthetic Data

Description

Generates a synthetic dataset as follows: 1) Sample every element in the data matrix X from N(0,1). 2) Generate a vector B with the first k entries set to 1 and the rest set to 0. 3) Sample every element in the noise vector e from a zero-mean normal distribution whose variance is set by snr. 4) Set y = XB + b0 + e.
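The four steps above can be sketched in base R as follows. This is a minimal illustration, not the package's source: the function name gen_synthetic_sketch is hypothetical, thresholding by rho is omitted, and the noise standard deviation is exposed as a hypothetical noise_sd argument (default 1) rather than derived from snr.

```r
# Hypothetical helper illustrating steps 1-4; not the L0Learn implementation.
gen_synthetic_sketch <- function(n, p, k, seed, b0 = 0, noise_sd = 1) {
  stopifnot(k <= p)
  set.seed(seed)                                  # reproducible draws
  X <- matrix(rnorm(n * p), nrow = n, ncol = p)   # step 1: X[i, j] ~ N(0, 1)
  B <- c(rep(1, k), rep(0, p - k))                # step 2: first k entries are 1
  e <- rnorm(n, sd = noise_sd)                    # step 3: zero-mean Gaussian noise
  y <- as.vector(X %*% B) + b0 + e                # step 4: y = XB + b0 + e
  list(X = X, y = y, B = B, e = e, b0 = b0)
}

sim <- gen_synthetic_sketch(n = 100, p = 20, k = 10, seed = 1)
dim(sim$X)   # 100 20
sum(sim$B)   # 10
```

Fixing the seed inside the helper means two calls with the same arguments return identical data, which is convenient when comparing estimators on the same simulated problem.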

Usage

GenSynthetic(n, p, k, seed, rho = 0, b0 = 0, snr = 1)

Arguments

n

Number of samples

p

Number of features

k

Number of non-zeros in true vector of coefficients

seed

The seed used for randomly generating the data

rho

The threshold for setting values of X to 0: if |X(i, j)| < rho, then X(i, j) <- 0. With the default rho = 0, no entries are zeroed.

b0

Intercept value by which y is shifted.

snr

Desired signal-to-noise ratio (SNR), defined as SNR = Var(XB)/Var(e). This sets the magnitude of the error term e.
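To make the SNR definition concrete, the sketch below builds a signal XB and rescales a noise vector so that the sample ratio Var(XB)/Var(e) matches a target value. Rescaling the drawn sample is an assumption made purely for illustration; a generator could instead draw e with standard deviation sqrt(Var(XB)/snr), in which case the empirical ratio would only approximate snr. All variable names here are hypothetical.

```r
set.seed(1)
n <- 200; p <- 20; k <- 10
snr <- 4                                          # target signal-to-noise ratio

X <- matrix(rnorm(n * p), nrow = n, ncol = p)
B <- c(rep(1, k), rep(0, p - k))
signal <- as.vector(X %*% B)                      # the XB term

e_raw <- rnorm(n)                                 # unscaled N(0, 1) noise
# Rescale so that var(e) == var(signal) / snr (illustrative assumption).
e <- e_raw * sqrt(var(signal) / (snr * var(e_raw)))

var(signal) / var(e)                              # equals snr up to floating-point rounding
```

A larger snr shrinks e relative to XB, so the response y is closer to a noiseless linear function of X.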

Value

A list containing: the data matrix X, the response vector y, the coefficients B, the error vector e, the intercept term b0.

Examples

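library(L0Learn)

# Simulate 100 samples with 20 features; the first 10 coefficients are nonzero
data <- GenSynthetic(n = 100, p = 20, k = 10, seed = 1)
X <- data$X  # data matrix
y <- data$y  # response vector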

L0Learn documentation built on March 7, 2023, 8:18 p.m.