generate_X: Simulate Gaussian and binary covariate predictors


View source: R/simdata.R

Description

Simulates Gaussian predictors with mean zero and covariance structure determined by the "cov_type" argument. Then p_b randomly selected columns are dichotomized.

Usage

generate_X(n, p, p_b, cov_type, rho = 0.5)

Arguments

n

number of observations (rows of X)

p

total number of covariates (columns of X), both continuous and binary

p_b

number of binary covariates (0 <= p_b <= p)

cov_type

character string specifying the covariance function. Can be one of "cov_diag" (independent columns), "cov_equi" (equi-correlated columns), or "cov_ar1" (AR(1)-correlated columns). The columns are shuffled during simulation.

rho

correlation parameter; input to the cov_type function
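
To make the three covariance structures concrete, here is a minimal sketch of what such matrices commonly look like. The helper names sketch_cov_diag, sketch_cov_equi and sketch_cov_ar1 are hypothetical and are not the package's own functions. Unit variances are an assumption, and the package's implementations may differ.

```r
# Hypothetical helpers for illustration only; these are NOT the package's
# cov_diag, cov_equi, or cov_ar1. Unit variances are assumed throughout.

# Independent columns: identity matrix.
sketch_cov_diag <- function(p) {
  diag(p)
}

# Equi-correlated columns: every off-diagonal entry equals rho.
sketch_cov_equi <- function(p, rho = 0.5) {
  Sigma <- matrix(rho, nrow = p, ncol = p)
  diag(Sigma) <- 1
  Sigma
}

# AR(1)-correlated columns: entry (i, j) equals rho^|i - j|.
sketch_cov_ar1 <- function(p, rho = 0.5) {
  rho^abs(outer(seq_len(p), seq_len(p), "-"))
}
```

Under these assumptions, sketch_cov_ar1(3, 0.5) has correlation 0.5 between neighbouring columns and 0.25 between columns 1 and 3, whereas sketch_cov_equi(3, 0.5) has 0.5 everywhere off the diagonal.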

Details

This function simulates a data frame whose rows are multivariate Gaussian with mean zero and covariance structure determined by the "cov_type" argument. Then p_b randomly selected columns are dichotomized with the indicator function 1(x > 0). The continuous columns are of class "numeric" and the binary columns are set to class "factor".
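
The two steps described above (Gaussian draw, then dichotomization) can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's source. The function name sketch_generate_X is hypothetical and it takes a covariance matrix Sigma directly instead of a "cov_type" string. The factor levels 0 and 1 are assumed, and the column shuffling mentioned under Arguments is omitted.

```r
# Hypothetical re-implementation for illustration; not the package's code.
sketch_generate_X <- function(n, p, p_b, Sigma) {
  stopifnot(p_b >= 0, p_b <= p, all(dim(Sigma) == p))

  # Draw n rows from N(0, Sigma): Z has i.i.d. N(0, 1) entries, and
  # chol(Sigma) returns an upper-triangular R with t(R) %*% R = Sigma,
  # so the rows of Z %*% R have covariance Sigma.
  Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
  X <- as.data.frame(Z %*% chol(Sigma))

  # Pick p_b columns at random and dichotomize them with 1(x > 0).
  # The levels c(0, 1) are an assumption of this sketch.
  binary_cols <- sample.int(p, size = p_b)
  for (j in binary_cols) {
    X[[j]] <- factor(as.integer(X[[j]] > 0), levels = c(0, 1))
  }
  X
}

# Usage: equi-correlated Sigma with rho = 0.5, two of six columns binary.
set.seed(1)
Sigma <- matrix(0.5, nrow = 6, ncol = 6)
diag(Sigma) <- 1
X_sketch <- sketch_generate_X(n = 100, p = 6, p_b = 2, Sigma = Sigma)
```

Because the cut point is the mean of a zero-mean Gaussian, each dichotomized column is expected to be roughly balanced between its two levels.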

Value

The simulated data.frame with n rows and p columns (p_b of which are binary and p - p_b of which are Gaussian). Each column is of class either "numeric" or "factor".

Examples

library(seqknockoff)

# all columns are continuous:
X <- generate_X(n=100, p=6, p_b=0, cov_type="cov_equi", rho=0.5)

round(cor(X), 2)

# two of the six columns are dichotomized (and set to class factor):
X <- generate_X(n=100, p=6, p_b=2, cov_type="cov_equi", rho=0.5)

# The class of each column:
unlist(lapply(X, class))

kormama1/seqknockoff documentation built on April 11, 2021, 7:44 a.m.