hdpGLM_simulateData: Simulate a Data Set from hdpGLM

hdpGLM_simulateData    R Documentation

Simulate a Data Set from hdpGLM

Description

Simulate a Data Set from hdpGLM

Usage

hdpGLM_simulateData(
  n,
  K,
  nCov = 2,
  nCovj = 0,
  J = 1,
  family = "gaussian",
  parameters = NULL,
  pi = NULL,
  same.K = FALSE,
  seed = NULL,
  context.effect = NULL,
  same.clusters.across.contexts = NULL,
  context.dependent.cluster = NULL
)

Arguments

n

integer, the sample size of the data. If there are multiple contexts, each context will have n cases.

K

integer, the number of clusters. If there are multiple contexts, K is the average number of clusters across contexts, and each context gets a number of clusters sampled from a Poisson distribution, unless same.K is TRUE.

nCov

integer, the number of covariates of the GLM components.

nCovj

integer, the number of covariates determining the average parameter of the base measure of the Dirichlet process prior.

J

integer, the number of contexts.

family

a character string, one of 'gaussian', 'binomial', or 'multinomial'. It indicates the family of the GLM components of the mixture model.

parameters

either NULL or a list with the parameter values used to generate the data. The format should be the same as the output of the function hdpGLM_simulateParameters(). If not NULL, it must contain a sublist named beta, an element named tau, and a vector named pi. The sublist beta must be a list of vectors, each of size nCov+1, holding the coefficients of the GLM mixture components that will generate the data. The element tau must be a 1x1 matrix containing 1 if nCovj=0 (single-context case), and a (nCov+1)x(nCovj+1) matrix if nCovj>0. The vector pi must add up to 1 and have length K.

pi

either NULL or a vector of length K whose elements add up to 1. If not NULL, it determines the mixture probabilities.

same.K

boolean, used when data is sampled from more than one context. If TRUE, all contexts get the same number of clusters. If FALSE, each context gets a number of clusters sampled from a Poisson distribution with expectation equal to K (currently not implemented).

seed

a seed for set.seed

context.effect

either NULL or an integer vector of length two. If it is NULL, all the coefficients (beta) of the individual-level covariates are functions of context-level features (tau). If it is not NULL, the first component of the vector indicates the index of the lower-level covariate (X) whose linear effect beta depends on context (tau) (0 is the intercept). The second component indicates the index of the context-level covariate (W) whose linear coefficient (tau) is non-zero.

same.clusters.across.contexts

boolean, if TRUE all the contexts will have the same number of clusters AND each cluster will have the same coefficient beta.

context.dependent.cluster

integer, indicates which cluster will be context-dependent. If zero, all clusters will be context-dependent.
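
Examples

The following is a minimal sketch, not taken from the package, that illustrates the argument descriptions above. It first builds a parameters list by hand for the single-context case (nCovj = 0) and checks it against the documented shape requirements, then makes a basic call using only arguments listed under Usage. All numeric values are arbitrary assumptions chosen for illustration. Whether a hand-built list like this is accepted as-is by the function has not been verified here, so the list is only validated and is not passed to the call.

```r
# Illustrative values only; none of these numbers come from the package.
n    <- 400   # cases per context
K    <- 2     # number of clusters
nCov <- 2     # covariates in each GLM component

# Hand-built `parameters` list following the documented constraints
# for the single-context case (nCovj = 0).
parameters <- list(
  beta = list(c(0, 2, -1), c(1, -3, 4)),  # K vectors, each of length nCov + 1
  tau  = matrix(1, nrow = 1, ncol = 1),   # 1x1 matrix containing 1
  pi   = c(0.3, 0.7)                      # mixture probabilities, length K
)

# Check the documented shape requirements.
stopifnot(
  length(parameters$beta) == K,
  all(lengths(parameters$beta) == nCov + 1),
  identical(dim(parameters$tau), c(1L, 1L)),
  length(parameters$pi) == K,
  isTRUE(all.equal(sum(parameters$pi), 1))
)

# Basic call using only arguments listed under Usage.
# Skipped when the hdpGLM package is not installed.
if (requireNamespace("hdpGLM", quietly = TRUE)) {
  sim <- hdpGLM::hdpGLM_simulateData(n = n, K = K, nCov = nCov,
                                     family = "gaussian", seed = 1)
  # The structure of the return value is not described on this page,
  # so inspect it rather than assume particular element names.
  str(sim, max.level = 1)
}
```

Passing seed makes the simulated data reproducible across runs. The call is wrapped in requireNamespace() so that the shape checks can be run on their own.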


hdpGLM documentation built on Oct. 13, 2023, 1:17 a.m.