generate_y: Simulate Gaussian response from a sparse regression model

View source: R/simdata.R

Description

Simulate Gaussian response from a sparse regression model

Usage

generate_y(X, p_nn, a)

Arguments

X

design matrix for the regression, typically the data.frame created with generate_X (see Details). Numeric columns of X should have variance 1/nrow(X), which is the default behavior of generate_X.

p_nn

number of non-null covariate predictors. The regression coefficients (beta) corresponding to columns 1:p_nn of x are non-zero; all others are set to zero.

a

amplitude of non-null regression coefficients
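
To make the roles of p_nn and a concrete, here is a minimal sketch of the coefficient vector they imply. It assumes, as the example on this page suggests, that every non-null coefficient equals a. The value p (number of columns in the expanded model matrix) is a hypothetical chosen for illustration.

```r
# Hypothetical values, chosen only for illustration
p    <- 6   # number of columns in the expanded model matrix x
p_nn <- 2   # number of non-null coefficients
a    <- 2   # amplitude of the non-null coefficients

# First p_nn coefficients equal a; the remaining p - p_nn are zero
beta <- c(rep(a, p_nn), rep(0, p - p_nn))
beta
#> [1] 2 2 0 0 0 0
```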

Details

This function takes as input a data.frame X (created with the function generate_X) that may consist of both numeric and binary factor columns. This data frame is then expanded to a model matrix x (with the model.matrix function). The binary factor variables become dummy indicators, which are then scaled by a factor of 0.5*sqrt(nrow(X)) so that the column-wise variance of x is equal to 1/n, where n = nrow(X). This makes sense as long as the variance of the numeric columns is also equal to 1/n (which it is if X is generated with the function generate_X). Next we simulate y ~ N(x %*% beta, I), where the first p_nn coefficients of beta are equal to a and the remaining coefficients (p_nn+1):ncol(x) are set to zero.
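
The steps described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's source: the names sketch_scaled_x and sketch_generate_y are hypothetical, the intercept column is assumed to be dropped, "scaled by" is read as "divided by" (the only reading that yields variance 1/n for a balanced 0/1 indicator), and the variance is taken with denominator n.

```r
# Hypothetical helper: expand X and rescale the binary dummy columns.
sketch_scaled_x <- function(X) {
  n <- nrow(X)
  # Expand factors into dummy indicators; drop the intercept (assumption)
  x <- model.matrix(~ ., data = X)[, -1, drop = FALSE]
  # Binary factors yield one dummy column each, so columns of x align with X
  is_dummy <- vapply(X, is.factor, logical(1))
  # Assumption: "scaled by" means divided by 0.5 * sqrt(n)
  x[, is_dummy] <- x[, is_dummy] / (0.5 * sqrt(n))
  x
}

# Hypothetical stand-in for generate_y: y ~ N(x %*% beta, I)
sketch_generate_y <- function(X, p_nn, a) {
  x <- sketch_scaled_x(X)
  beta <- c(rep(a, p_nn), rep(0, ncol(x) - p_nn))
  as.numeric(x %*% beta + rnorm(nrow(x)))
}

set.seed(1)
n <- 100
# Toy data built by hand (not with generate_X): two numeric columns with
# variance close to 1/n and one balanced binary factor
X_toy <- data.frame(
  x1 = rnorm(n) / sqrt(n),
  x2 = rnorm(n) / sqrt(n),
  b1 = factor(rep(c(0, 1), each = n / 2))
)

x_toy <- sketch_scaled_x(X_toy)
d <- x_toy[, 3]
# Variance (denominator n) of the rescaled balanced dummy equals 1/n
dummy_var <- mean((d - mean(d))^2)

y_toy <- sketch_generate_y(X_toy, p_nn = 2, a = 2)
```

For a balanced 0/1 indicator the variance before rescaling is 0.25, so dividing by 0.5*sqrt(n) gives 0.25 / (0.25 * n) = 1/n, matching the numeric columns.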

Value

simulated Gaussian response from the regression model y = x %*% beta + epsilon, where epsilon ~ N(0, I), x is the model.matrix of X, and the binary dummy indicators of x have been scaled so that their variance = 1/nrow(X).

Examples

library(seqknockoff)

set.seed(1)

# Simulate 4 Gaussian and 2 binary covariate predictors:
X <- generate_X(n=100, p=6, p_b=2, cov_type="cov_equi", rho=0.5)

# Simulate response from model y = 2*X[,1] + 2*X[,2] + epsilon, where epsilon ~ N(0,1)
y <- generate_y(X, p_nn=2, a=2)

kormama1/seqknockoff documentation built on April 11, 2021, 7:44 a.m.