genData: Response and Covariate Data Generation

Description

Generates data that can be used to test forward stagewise and penalized regression techniques. Currently, marginally Gaussian and Poisson responses are supported.

The function gives the user a simple way to generate the kind of data that sgee functions were designed for. Parameters controlling the response correlation, the covariate group structure, the marginal response distribution, and the signal-to-noise ratio (for marginally Gaussian responses) allow fine control over the kind of data that is generated.

Usage

genData(numClusters, clusterSize = 1, clusterRho = 0,
  clusterCorstr = "exchangeable", yVariance = NULL, xVariance = 1,
  numGroups = length(beta), groupSize = 1, groupRho = 0, beta = 0,
  numMainEffects = NULL, family = gaussian(), SNR = NULL, intercept = 0)

Arguments

numClusters

Number of clusters to be generated.

clusterSize

Size of each cluster.

clusterRho

Correlation parameter for response.

clusterCorstr

String indicating the cluster correlation structure. The value is passed to genCorMat, so any value accepted by genCorMat is allowed.

yVariance

Optional scalar value specifying the marginal response variance; overrides SNR.

xVariance

Scalar value indicating marginal variance of the covariates.

numGroups

Number of covariate groups to be generated. Default behavior is to generate groups of size 1 (effectively no groups). If covariate groups are desired, numGroups and groupSize must be given such that length(beta) equals numGroups * groupSize.

groupSize

Size of each group.

groupRho

Within group correlation parameter.

beta

Vector of coefficient values used to generate response.

numMainEffects

An integer indicating that the first numMainEffects terms in beta are to be treated as main effects and the remaining terms are pairwise interaction effects, which are in the same order as generated by model.matrix. Default value of NULL indicates no interaction terms are included. The use of numMainEffects overrides any covariate grouping structure provided by the user.

family

Marginal response family; currently gaussian() and poisson() are accepted.

SNR

Scalar value that fixes the signal-to-noise ratio, defined as the ratio of the (observed) variance of the linear predictor to the variance of the response conditional on the covariates.

intercept

Scalar value indicating the true intercept value.
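
The SNR definition given for the SNR argument can be illustrated with a short base R sketch. It does not call genData and is not taken from the sgee source; the names n, targetSNR, noiseVar, and achievedSNR are hypothetical, and the rule noiseVar = var(eta) / SNR is an assumption that simply follows the stated definition for a Gaussian response.

```r
set.seed(1)

n <- 200                       # hypothetical number of observations
beta <- c(1, 1, 0, 0)          # hypothetical coefficient vector
x <- matrix(rnorm(n * length(beta)), nrow = n)

eta <- drop(x %*% beta)        # linear predictor (no intercept here)
targetSNR <- 10

# Assumption: the conditional (noise) variance is chosen so that
# var(eta) / noiseVar equals the requested signal-to-noise ratio.
noiseVar <- var(eta) / targetSNR
y <- eta + rnorm(n, sd = sqrt(noiseVar))

# Ratio of the observed linear predictor variance to the noise variance
achievedSNR <- var(eta) / noiseVar
print(achievedSNR)
```

Because the noise variance is derived from the observed variance of eta, the ratio matches the target for this particular sample rather than only in expectation.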

Value

List containing the generated response, y, the generated covariates, x, a vector identifying the cluster of each response, clusterID, and a vector identifying the group of each covariate, groupID.
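
To make the shape of the identifier vectors concrete, the following base R sketch builds vectors of the kind clusterID and groupID describe, using the sizes from the examples on this page. It does not call genData; the block layout (consecutive integer labels, each repeated clusterSize or groupSize times) is an assumption for illustration, not a statement about how sgee labels its output.

```r
numClusters <- 10
clusterSize <- 4
numGroups <- 5
groupSize <- 4
beta <- c(rep(1, 8), rep(0, 12))

# The documented constraint: one coefficient per grouped covariate
stopifnot(length(beta) == numGroups * groupSize)

# Assumed layout: one label per response, clusters stored in blocks
clusterID <- rep(seq_len(numClusters), each = clusterSize)

# Assumed layout: one label per covariate, groups stored in blocks
groupID <- rep(seq_len(numGroups), each = groupSize)

print(table(clusterID))
print(table(groupID))
```

With these sizes there are 40 responses and 20 covariates, so x would be expected to have 40 rows and 20 columns.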

Note

The function is used to generate both the desired covariate structure and the desired response structure. To generate Poisson responses, functions from the R package copula are used.

The current implementation of interactions overwrites any previous grouping structure; that is, the number of groups becomes p and the group sizes are set to 1.
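
The ordering of pairwise interaction terms produced by model.matrix, which numMainEffects relies on, can be checked with a small base R sketch. The data frame and the formula ~ .^2 are assumptions chosen for illustration; the formula that genData uses internally is not shown on this page.

```r
# Three hypothetical main-effect covariates
x <- data.frame(x1 = c(1, 2, 3, 4),
                x2 = c(0, 1, 0, 1),
                x3 = c(2, 2, 5, 5))
numMainEffects <- ncol(x)

# Main effects plus all pairwise interactions
mm <- model.matrix(~ .^2, data = x)

# Drop the intercept and main-effect columns to see the interaction order
interactionOrder <- colnames(mm)[-seq_len(1 + numMainEffects)]
print(interactionOrder)
```

With three main effects there are choose(3, 2) = 3 pairwise interactions, so a matching beta would hold three main-effect coefficients followed by three interaction coefficients in the printed order.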

Author(s)

Gregory Vaughan

Examples

## A response variance can be given,
dat1 <- genData(numClusters = 10,
                clusterSize = 4,
                clusterRho = .5,
                clusterCorstr = "exchangeable",
                yVariance = 1,
                xVariance = 1,
                numGroups = 5,
                groupSize = 4,
                groupRho = .5,
                beta = c(rep(1,8), rep(0,12)),
                family = gaussian(),
                intercept = 1)

## or the signal to noise ratio can be fixed
dat2 <- genData(numClusters = 10,
                clusterSize = 4,
                clusterRho = .5,
                clusterCorstr = "exchangeable",
                xVariance = 1,
                numGroups = 5,
                groupSize = 4,
                groupRho = .5,
                beta = c(rep(1,8), rep(0,12)),
                family = poisson(),
                SNR = 10,
                intercept = 1)

sgee documentation built on May 1, 2019, 7:10 p.m.