GendataGP: Generate simulation data (Complete data with group...

GendataGP    R Documentation

Generate simulation data (Complete data with group predictors)

Description

In many regression problems, some predictors are naturally grouped. The most common example with group variables is the multifactor analysis of variance (ANOVA) problem, where each factor may have several levels and can be expressed through a group of dummy variables. This function quickly generates simulation data with group predictors. You only need to supply the sample size and dimension of the data you want to generate, together with the covariance parameter rho. The simulated example follows Example 2 of Li et al. (2012).

Usage

GendataGP(n, p, rho, error = c("gaussian", "t", "cauchy"))

Arguments

n

Number of subjects in the dataset to be simulated. This also equals the number of rows in the simulated dataset, because each row is assumed to represent a different independent and identically distributed subject.

p

Number of predictor variables (covariates) in the simulated dataset. These covariates will be the features screened by model-free procedures.

rho

The correlation between adjacent covariates in the simulated matrix X. The within-subject covariance matrix of X is assumed to have the same form as an AR(1) autoregressive covariance matrix, although this does not imply that the X covariates for each subject are in fact a time series; the AR(1) form serves only as an example of a parsimonious but nontrivial covariance structure. If rho is zero, the X covariates are independent and the simulation runs faster.

error

The distribution of the error term: one of "gaussian", "t", or "cauchy".
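
To make the roles of rho and error concrete, the following base R sketch builds an AR(1)-type correlation matrix, draws covariates with that correlation structure, and draws an error term from one of the three named distributions. It is an illustration under stated assumptions, not the package's implementation: the helper draw_error, the unit variances, and the t degrees of freedom (df = 3) are hypothetical choices made for this example.

```r
# Illustrative sketch only; not the MFSIS source code.
set.seed(1)
n <- 100
p <- 20
rho <- 0.5

# AR(1)-type correlation matrix: entry (i, j) equals rho^|i - j|,
# so adjacent covariates have correlation rho (unit variances assumed).
Sigma <- rho^abs(outer(seq_len(p), seq_len(p), "-"))

# Draw n i.i.d. rows with covariance Sigma via the Cholesky factor.
# chol() returns an upper-triangular R with t(R) %*% R equal to Sigma.
X <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% chol(Sigma)

# Hypothetical helper: draw n errors from one of the three named distributions.
# The degrees of freedom for the t case (df = 3) are an assumption.
draw_error <- function(n, error = c("gaussian", "t", "cauchy"), df = 3) {
  error <- match.arg(error)
  switch(error,
         gaussian = rnorm(n),
         t        = rt(n, df = df),
         cauchy   = rcauchy(n))
}

eps <- draw_error(n, "t")
```

With rho = 0 the matrix Sigma reduces to the identity, so the Cholesky step is unnecessary and the columns of X are independent, which is why that case is cheaper to simulate.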

Value

A list containing the simulated data.

Author(s)

Xuewei Cheng xwcheng@hunnu.edu.cn

References

Li, R., W. Zhong, and L. Zhu (2012). Feature screening via distance correlation learning. Journal of the American Statistical Association 107(499), 1129–1139.

Examples

# Simulate 100 subjects and 200 covariates with adjacent correlation 0.5
# and Gaussian errors.
n <- 100
p <- 200
rho <- 0.5
data <- GendataGP(n, p, rho, "gaussian")


MFSIS documentation built on June 22, 2024, 9:42 a.m.