Sim_GCRdata: Sim_GCRdata

View source: R/Sim_GCRdata.R

Sim_GCRdataR Documentation

Sim_GCRdata

Description

Simulate data from the Gaussian Copula Regression model

Usage

Sim_GCRdata(
  m,
  n,
  p,
  pFixed,
  responseType,
  extras = list(rhoY = 0.8, rhoX = 0.7, fixedEffect = rep(1, pFixed), muEffect = rep(0,
    m), varEffect = rep(1, m), s1Lev = 0.1, s2Lev = 1, sdResponse = rep(1, m), nCat =
    NULL, cutPoint = NULL, nTrial = NULL, negBinomPar = NULL),
  std = TRUE,
  seed = 31122021
)

Arguments

m

Integer, number of responses

n

Integer, number of samples

p

Integer, number of predictors

pFixed

Integer, number of covariates (including the intercept). If pFixed = 1 only the intercept will be simulated

responseType

m-dimensional character vector specifying the response's type. Response's types currently supported are c("Gaussian", "binary", "ordinal", "binomial", "negative binomial")

extras

List specifying response-specific arguments. These are the one-lag correlation rhoY in the autoregressive model used to simulate the responses, the correlation rhoX between the predictors (for both, see details in \insertCiteRothman2010;textualBVS4GCR), fixedEffect the vector of the regression coefficients for the fixed predictors (if simulated) where the first element is assumed to be the intercept, the mean muEffect and variance varEffect of the Gaussian distribution used to simulate the regression coefficients, the probabilities s1Lev and s2Lev utilized to induce sparsity on the regression coefficients, the standard deviation sdResponse of the response variables, the number of categories nCat of each ordinal categorical response (if simulated), the cut-off points cutPoint used to simulate the ordinal variables (if simulated), nTrial and negBinomPar the numer of trials in the binomial experiment and the over-dispersion parameter of the negative binomial distribution. See also the examples below and details in \insertCiteAlexopoulos2021;textualBVS4GCR

std

Logical parameter to standardise the predictors before the analysis. Default value is set at TRUE

seed

Seed used to initialise the simulation

Details

The parameters s1Lev and s2Lev in the extras list can be specified following the fact that ((1 - s2Lev) * p) predictors are irrelevant for all m responses and that each relevant predictor is common across (s1Lev * m) predictors \insertCiteRothman2010BVS4GCR. Type of responses currently supported: (continuous) Gaussian, (discrete) binary, ordinal categorical and count (binomial and negative binomial distributions)

Value

The value returned is a list object list(Y, X, B, R, par)

  • Y n times m matrix of responses

  • X n times (pFixed + p) matrix of covariates (including the intercept if simulated) and predictors on which Bayesian Variable Selection is performed

  • B m times (pFixed + p) matrix of regression coefficients

  • R m times m correlation matrix used to simulate the responses

  • extra List of all parameters used in the simulation list(rhoY, rhoX, fixedEffect, muEffect, varEffect, s1Lev, s2Lev, sdResponse, nCat, cutPoint, nTrial, negBinomPar, std, seed)

References

\insertAllCited

Examples

# Example 1: Simulate a combination of Gaussian, binary and ordinal responses

d <- Sim_GCRdata(m = 4, n = 500, p = 30, pFixed = 1, responseType = c("Gaussian", "binary", 
                 "ordinal", "ordinal"), 
                 extras = list(rhoY = 0.8, rhoX = 0.7, fixedEffect = 1, muEffect = 
                               c(1, rep(0.5, 3)), varEffect = c(1, rep(0.2, 3)), s1Lev = 0.15, 
                               s2Lev = 0.95, sdResponse = rep(1, 4), nCat = c(3, 4), cutPoint = 
                               cbind(c(0.5, NA, NA), c(0.5, 1.2, 2))), 
                 seed = 28061971)

par(mfrow = c(2, 2))
hist(d$Y[,1], main = "Gaussian", xlab = expression(Y[1]), prob = TRUE, ylab = "")
barplot(prop.table(table(d$Y[, 2])), main = "binary", xlab = expression(Y[2]))
barplot(prop.table(table(d$Y[, 3])), main = "ordinal (3 categories)", xlab = expression(Y[3]))
barplot(prop.table(table(d$Y[, 4])), main = "ordinal (4 categories)", xlab = expression(Y[4]))

par(mfrow = c(2, 2))
plot(d$B[, 1], ylab=expression(beta[1]), main=expression(Y[1]), xlab="Pred.", pch=16, cex=.7)
plot(d$B[, 2], ylab=expression(beta[2]), main=expression(Y[2]), xlab="Pred.", pch=16, cex=.7)
plot(d$B[, 3], ylab=expression(beta[3]), main=expression(Y[3]), xlab="Pred.", pch=16, cex=.7)
plot(d$B[, 4], ylab=expression(beta[4]), main=expression(Y[4]), xlab="Pred.", pch=16, cex=.7)


# Example 2: As in Example 1 with more categories for the forth response 

d <- Sim_GCRdata(m = 4, n = 500, p = 30, pFixed = 1, responseType = c("Gaussian", "binary", 
                 "ordinal", "ordinal"), 
                 extras = list(rhoY = 0.8, rhoX = 0.7, fixedEffect = 1, muEffect = 
                               c(1, rep(0.5, 3)), varEffect = c(1, rep(0.2, 3)), s1Lev = 0.15, 
                               s2Lev = 0.95, sdResponse = rep(1, 4), nCat = c(3, 5), cutPoint = 
                               cbind(c(0.5, 1, NA, NA), c(0.5, 0.7, 1.5, 2))), 
                 seed = 28061971)

par(mfrow = c(2, 2))
hist(d$Y[, 1], main = "Gaussian", xlab = expression(Y[1]), prob = TRUE, ylab = "")
barplot(prop.table(table(d$Y[, 2])), main = "binary", xlab = expression(Y[2]))
barplot(prop.table(table(d$Y[, 3])), main = "ordinal (3 categories)", xlab = expression(Y[3]))
barplot(prop.table(table(d$Y[, 4])), main = "ordinal (5 categories)", xlab = expression(Y[4]))


# Example 3: Simulate a combination of Gaussian and count responses

d <- Sim_GCRdata(m = 4, n = 1000, p = 100, pFixed = 2, responseType = c("Gaussian", "binomial", 
                 "negative binomial", "negative binomial"), 
                 extras = list(rhoY = 0.8, rhoX = 0.7, fixedEffect = c(-0.5, 0.5), muEffect = 
                 c(1, rep(0.5, 3)), varEffect = c(1, rep(0.2, 3)), s1Lev = 0.05, s2Lev = 0.95, 
                 sdResponse = rep(1, 4), nTrial = 10, negBinomPar = c(0.5, 0.75)), 
                 seed = 28061971)

par(mfrow = c(2, 2))
hist(d$Y[,1], main = "Gaussian", xlab = expression(Y[1]), prob = TRUE, ylab = "")
hist(table(d$Y[, 2]), main = "binomial", xlab = expression(Y[2]))
hist(table(d$Y[, 3]), main = "negative binomial", xlab = expression(Y[3]))
hist(table(d$Y[, 4]), main = "negative binomial", xlab = expression(Y[4]))


lb664/BVS4GCR documentation built on Feb. 6, 2023, 11:21 p.m.