gen_GEE_data: Generate the datasets with clusters
In seqest: Sequential Method for Classification and Generalized Estimating Equations Problem

Description Usage Arguments Details Value References See Also Examples

gen_GEE_data generates clustered data for use with the sequential method for generalized estimating equations (GEE).

gen_GEE_data(numClusters, clusterSize, clusterRho, clusterCorstr, beta,
  family, intercept = TRUE, xCorstr = "ar1", xCorRho = 0.5,
  xVariance = 0.2)

`numClusters`	A number giving the number of clusters to generate. Each cluster contains several similar subjects. It should be an integer.
`clusterSize`	A number specifying the number of subjects in each cluster. Subjects in the same cluster are highly correlated with one another, so each cluster can be regarded as longitudinal data.
`clusterRho`	A numeric parameter of the correlation structure for the clusters. It is ignored when clusterCorstr is "independence".
`clusterCorstr`	A character string specifying the correlation structure for the clusters. Allowed structures are: "independence", "exchangeable" and "ar1".
`beta`	A numeric vector giving the true parameter of the GEE model.
`family`	The type of response data, matching one of 'gaussian()' or 'binomial()'. 'gaussian()' corresponds to the continuous case and 'binomial()' corresponds to the discrete case.
`intercept`	A logical value indicating whether to add intercept term. The default value is TRUE.
`xCorstr`	A character string specifying the correlation structure for the covariate. The default value is 'ar1'.
`xCorRho`	A numeric parameter giving the correlation coefficient of the covariates. It plays the same role for the covariates that clusterRho plays for the clusters. The default value is 0.5.
`xVariance`	A number specifying the marginal variance used with the covariate correlation matrix within a cluster. The default value is 0.2.
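The three structures accepted by clusterCorstr can be illustrated with a small base R sketch. The helper make_cor below is hypothetical and is not part of seqest; it only shows what "independence", "exchangeable" and "ar1" conventionally mean for a correlation matrix of a given size and coefficient, and the package may construct its matrices differently.

```r
# Hypothetical helper (not part of seqest): build a correlation matrix
# of dimension `size` for one of the three named structures.
make_cor <- function(size, rho,
                     corstr = c("independence", "exchangeable", "ar1")) {
  corstr <- match.arg(corstr)
  switch(corstr,
    # Identity matrix: no correlation, rho is ignored.
    independence = diag(size),
    # Every pair of distinct subjects has correlation rho.
    exchangeable = {
      m <- matrix(rho, size, size)
      diag(m) <- 1
      m
    },
    # Correlation decays as rho^|i - j| with the distance between positions.
    ar1 = rho^abs(outer(seq_len(size), seq_len(size), "-"))
  )
}

make_cor(3, 0.3, "exchangeable")
make_cor(3, 0.3, "ar1")
```

Under the exchangeable structure all off-diagonal entries equal rho, while under AR(1) they shrink with the distance between two positions in the cluster.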

The gen_GEE_data function generates data from one of two distributions, corresponding to the continuous and discrete cases.

In the continuous case, the covariate vector x is drawn from a multivariate normal distribution with mean 0 and an AR(1) correlation matrix determined by an autocorrelation coefficient and a marginal variance, both of which are set through arguments (xCorRho and xVariance). The response y is then generated by the equation y = wx + e, where w is the coefficient vector (the beta argument) and the random error vector e follows a normal distribution with mean 0 and one of three covariance structures of the corresponding dimension: the identity matrix, the exchangeable structure, or the AR(1) autoregressive structure.

In the discrete case, a logistic model is used. The covariate vector x is generated as in the continuous case. The binary response vector for each cluster has an AR(1) correlation structure with correlation coefficient alpha, and the marginal expectation u satisfies the equation logit(u) = wx.
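The continuous case described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helpers ar1_cor and rmvn are hypothetical, the errors are given unit marginal variance, and no intercept column is added.

```r
# Hypothetical helper: AR(1) correlation matrix of dimension p.
ar1_cor <- function(p, rho) rho^abs(outer(seq_len(p), seq_len(p), "-"))

# Hypothetical helper: draw n rows from N(0, sigma) using a Cholesky factor.
rmvn <- function(n, sigma) {
  z <- matrix(rnorm(n * ncol(sigma)), nrow = n)
  z %*% chol(sigma)
}

set.seed(1)
numClusters <- 4
clusterSize <- 3
w <- c(1, -1.1, 1.5)   # coefficient vector (assumed values)
xVariance <- 0.2
xCorRho <- 0.5
clusterRho <- 0.3

n <- numClusters * clusterSize
# Covariates: mean 0, AR(1) correlation scaled by the marginal variance.
x <- rmvn(n, xVariance * ar1_cor(length(w), xCorRho))
# Errors: one AR(1)-correlated vector of length clusterSize per cluster,
# stacked so that the rows of each cluster are contiguous.
e <- as.vector(t(rmvn(numClusters, ar1_cor(clusterSize, clusterRho))))
# Response: y = wx + e.
y <- as.vector(x %*% w) + e
clusterID <- rep(seq_len(numClusters), each = clusterSize)
```

Replacing the error correlation matrix with an identity or exchangeable matrix gives the other two covariance structures mentioned above. The discrete case requires correlated binary responses and is not covered by this sketch.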

A list containing the following components:

`x`	The covariate matrix. The number of rows is numClusters * clusterSize, and the number of columns is the length of beta + 1 if intercept is TRUE.
`y`	The response data, which has the same number of rows as x.
`clusterID`	The cluster ID of each sample. Subjects in the same cluster share the same ID.

Chen, Z., Wang, Z., & Chang, Y. I. (2019). Sequential adaptive variables and subject selection for GEE methods. Biometrics. doi:10.1111/biom.13160

gen_multi_data for categorical and ordinal case

gen_bin_data for binary classification case.

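library(seqest)

initialSampleSize <- 75                  # number of clusters
clusterSize <- 5                         # subjects per cluster
responseCorstr <- "ar1"                  # correlation structure of the clusters
responseCorRho <- 0.3                    # correlation coefficient of the clusters
response <- gaussian()                   # continuous response
beta0 <- c(1, -1.1, 1.5, -2, rep(0, 50)) # four nonzero coefficients, then 50 zeros
xVariance <- 0.2
xCorRho <- 0.5
xCorstr <- "ar1"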
data <- gen_GEE_data(numClusters = initialSampleSize,
                     clusterSize = clusterSize,
                     clusterCorstr = responseCorstr,
                     clusterRho = responseCorRho,
                     beta = beta0,
                     family = response,
                     intercept = TRUE,
                     xVariance = xVariance,
                     xCorstr = xCorstr,
                     xCorRho = xCorRho)

seqest documentation built on July 2, 2020, 2:28 a.m.
