synthetic_data: synthetic_data

View source: R/helper_functions.R

synthetic_data    R Documentation

synthetic_data

Description

Generates synthetic linear and logistic regression data

Usage

synthetic_data(
  n,
  p,
  s0,
  error_std,
  type = "linear",
  scale = TRUE,
  signal = "constant"
)

Arguments

n

number of observations

p

number of covariates

s0

sparsity (number of non-zero components of the true signal)

error_std

standard deviation of the Gaussian noise (linear regression only)

type

dataset type ('linear' or 'logistic')

scale

whether the columns of the design matrix X are standardized to mean zero and standard deviation 1 (TRUE or FALSE)

signal

form of the non-zero components of the true signal ('constant' or 'decay')
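
As an illustration of what the scale argument describes, the sketch below standardizes the columns of a Gaussian matrix with base R's scale(). It is an assumption that the package performs an equivalent step internally; the names X_raw, X_std, col_means and col_sds are made up for this example.

```r
# Illustrative only: not taken from the ScaleSpikeSlab source.
set.seed(1)
n <- 100
p <- 5

# A raw design matrix whose columns are deliberately not standardized
X_raw <- matrix(rnorm(n * p, mean = 3, sd = 2), nrow = n, ncol = p)

# scale() centers each column and divides it by its sample standard deviation
X_std <- scale(X_raw)

# After standardization every column has mean 0 and standard deviation 1
# (up to floating point error)
col_means <- colMeans(X_std)
col_sds <- apply(X_std, 2, sd)
```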

Value

A list containing the design matrix X, the response y and the true signal vector true_beta, for linear or logistic regression
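
To make the shape of the returned object concrete, here is a minimal base R sketch of how sparse linear regression data of this kind could be generated. It is not the package's implementation: the helper name sketch_sparse_linear, the argument beta_value and the choice of 1 for the non-zero coefficients are assumptions made for illustration. Only the element names X, y and true_beta, and the convention that the first s0 entries are non-zero, follow this page.

```r
# Hypothetical helper, not part of ScaleSpikeSlab.
sketch_sparse_linear <- function(n, p, s0, error_std, beta_value = 1) {
  # Sparse signal: the first s0 entries are non-zero, the rest are zero
  true_beta <- numeric(p)
  s0 <- min(s0, p)
  if (s0 > 0) true_beta[seq_len(s0)] <- beta_value

  # Gaussian design with standardized columns (assumed, cf. scale = TRUE)
  X <- scale(matrix(rnorm(n * p), nrow = n, ncol = p))

  # Linear model: response = signal plus Gaussian noise with sd error_std
  y <- as.vector(X %*% true_beta + error_std * rnorm(n))

  list(X = X, y = y, true_beta = true_beta)
}

set.seed(42)
sketch <- sketch_sparse_linear(n = 100, p = 200, s0 = 5, error_std = 2)
```

The returned list can be inspected in the same way as the output of synthetic_data, for instance with dim(sketch$X) or length(sketch$y).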

Examples

syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2)

# syn_data$X is an n by p design matrix
dim(syn_data$X)

# syn_data$y is a length n response vector
length(syn_data$y) 

# syn_data$true_beta is a length p coefficient vector with only the first s0 entries non-zero
all(syn_data$true_beta[1:5]!=0)
all(syn_data$true_beta[-c(1:5)]==0)

ScaleSpikeSlab documentation built on May 18, 2022, 5:18 p.m.