gen.data: Generate simulated data

gen.data R Documentation

Generate simulated data

Description

Generate simulated data for a sparse group linear model.

Usage

gen.data(
  n,
  m,
  d,
  s,
  s0,
  cor.type = 1,
  beta.type = 1,
  rho = 0.5,
  sigma1 = 1,
  sigma2 = 1,
  seed = 1
)

Arguments

n

The number of observations.

m

The number of groups of interest.

d

The size of each group. Only equal group sizes are supported.

s

The number of important groups in the underlying regression model.

s0

The number of important variables in each important group.

cor.type

The correlation structure of the predictors. cor.type = 1 denotes the independence structure, where the (i,j) entry of the covariance matrix is I(i = j). cor.type = 2 denotes the exponential structure, where the (i,j) entry of the covariance matrix is rho^{|i-j|}. cor.type = 3 denotes the constant structure, where the off-diagonal entries of the covariance matrix are rho and the diagonal entries are 1.
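The three structures can be written out in a few lines of base R. The sketch below is illustrative only: make_sigma is a hypothetical helper, not a function exported by ADSIHT, and it may differ from what gen.data does internally. It assumes the independence structure is the identity matrix.

```r
# Hypothetical helper (not part of ADSIHT): build a p x p covariance matrix
# for each of the three documented correlation structures.
make_sigma <- function(p, cor.type = 1, rho = 0.5) {
  if (cor.type == 1) {
    diag(p)                              # independence: identity matrix
  } else if (cor.type == 2) {
    idx <- seq_len(p)
    rho^abs(outer(idx, idx, "-"))        # exponential: rho^|i - j|
  } else if (cor.type == 3) {
    Sigma <- matrix(rho, p, p)           # constant: rho off the diagonal
    diag(Sigma) <- 1                     # and 1 on the diagonal
    Sigma
  } else {
    stop("cor.type must be 1, 2, or 3")
  }
}

sigma_ind <- make_sigma(4, cor.type = 1)
sigma_exp <- make_sigma(4, cor.type = 2, rho = 0.5)
sigma_con <- make_sigma(4, cor.type = 3, rho = 0.5)
print(sigma_exp)
```

Under the exponential structure the correlation decays with the distance between variable indices, whereas under the constant structure every pair of distinct variables is equally correlated.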

beta.type

The structure of the coefficients. beta.type = 1 denotes the homogeneous setup, where each nonzero entry has the same magnitude. beta.type = 2 denotes the heterogeneous setup, where the coefficients are drawn from a normal distribution.
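As a rough illustration of how a sparse group coefficient vector of this kind could be assembled, consider the base R sketch below. Everything in it is an assumption made for illustration: make_beta and its scale argument are hypothetical, the random signs in the homogeneous case are a guess, and the way gen.data actually selects groups and variables may differ.

```r
# Hypothetical helper (not part of ADSIHT): sparse group coefficient vector
# with s active groups and s0 active variables inside each active group.
make_beta <- function(m, d, s, s0, beta.type = 1, scale = 1) {
  stopifnot(s <= m, s0 <= d)
  beta <- numeric(m * d)
  true.group <- sort(sample(m, s))       # indices of the active groups
  for (g in true.group) {
    # positions of the s0 active variables inside group g
    active <- (g - 1) * d + sort(sample(d, s0))
    beta[active] <- if (beta.type == 1) {
      # homogeneous: equal magnitude, random sign (the sign is an assumption)
      scale * sample(c(-1, 1), s0, replace = TRUE)
    } else {
      # heterogeneous: normal draws with standard deviation `scale`
      rnorm(s0, mean = 0, sd = scale)
    }
  }
  list(beta = beta,
       group = rep(seq_len(m), each = d),
       true.group = true.group)
}

set.seed(1)
b <- make_beta(m = 10, d = 5, s = 3, s0 = 2, beta.type = 1)
which(b$beta != 0)                       # positions of the active variables
```

The result is sparse at two levels: only s of the m groups carry signal, and within each of those only s0 of the d variables are nonzero.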

rho

A parameter that controls the pairwise correlation between predictors. Default is 0.5.

sigma1

The value controlling the strength of the Gaussian noise. A larger value implies stronger noise. Default is sigma1 = 1.

sigma2

The value controlling the strength of the coefficients. A larger value implies larger coefficients. Default is sigma2 = 1.
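To see how the noise level interacts with the signal, the base R sketch below assumes the usual linear model y = x beta + sigma1 * eps with standard normal eps. This form is an assumption about the underlying model, not something confirmed by the package source, and the variable names are illustrative.

```r
set.seed(1)
n <- 50
p <- 20
x <- matrix(rnorm(n * p), n, p)          # independent predictors for simplicity
beta <- c(rep(1, 3), rep(0, p - 3))      # three active coefficients (illustrative)
signal <- drop(x %*% beta)

# Assumed model: y = x beta + sigma1 * eps, with eps ~ N(0, 1)
eps <- rnorm(n)
y_low  <- signal + 0.5 * eps             # sigma1 = 0.5
y_high <- signal + 5 * eps               # sigma1 = 5

# Sample signal-to-noise ratio drops as sigma1 grows
snr <- function(y) var(signal) / var(y - signal)
c(low = snr(y_low), high = snr(y_high))
```

Under this assumed model, increasing sigma1 makes the important variables harder to recover, while increasing the coefficient scale (the role sigma2 plays in the description above) has the opposite effect.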

seed

Random seed. Default is seed = 1.

Value

A list object comprising:

x

Design matrix of predictors.

y

Response variable.

beta

The coefficients used in the underlying regression model.

group

The group index of each variable.

true.group

The important groups in the sparse group linear model.

true.variable

The important variables in the sparse group linear model.

Author(s)

Yanhang Zhang, Zhifan Li, Jianxin Yin.

Examples


# Generate simulated data
n <- 200
m <- 100
d <- 10
s <- 5
s0 <- 5
data <- gen.data(n, m, d, s, s0)
str(data)

ADSIHT documentation built on April 3, 2025, 9 p.m.