hdpGLM_simulateData: Simulate a Data Set from hdpGLM

hdpGLM_simulateData    R Documentation

Simulate a Data Set from hdpGLM


Simulates a data set from an hdpGLM model, that is, from a mixture of GLMs with K clusters observed in one or more contexts.


hdpGLM_simulateData(
  n,
  K,
  nCov = 2,
  nCovj = 0,
  J = 1,
  family = "gaussian",
  parameters = NULL,
  pi = NULL,
  same.K = FALSE,
  seed = NULL,
  context.effect = NULL,
  same.clusters.across.contexts = NULL,
  context.dependent.cluster = NULL
)



n: integer, the sample size of the data. If there are multiple contexts, each context will have n cases.


K: integer, the number of clusters. If there are multiple contexts, K is the average number of clusters across contexts, and each context gets a number of clusters sampled from a Poisson distribution, except if same.K is TRUE.


nCov: integer, the number of covariates of the GLM components.


nCovj: an integer indicating the number of covariates determining the average parameter of the base measure of the Dirichlet process prior.


J: an integer representing the number of contexts.


family: a character string, either 'gaussian', 'binomial', or 'multinomial'. It indicates the family of the GLM components of the mixture model.


parameters: either NULL or a list with the parameter values used to generate the data. The format should be the same as the output of the function hdpGLM_simulateParameters(). If not NULL, it must contain a sublist named beta and elements named tau and pi. The sublist beta must be a list of vectors, each of size nCov+1, holding the coefficients of the GLM mixture components that will generate the data. For tau, if nCovj=0 (single-context case), it must be a 1x1 matrix containing 1. If nCovj>0, it must be a (nCov+1)x(nCovj+1) matrix. The vector pi must add up to 1 and have length K.


pi: either NULL or a vector of length K whose elements add up to 1. If not NULL, it determines the mixture probabilities.


same.K: boolean, used when data is sampled from more than one context. If TRUE, all contexts get the same number of clusters. If FALSE, each context gets a number of clusters sampled from a Poisson distribution with expectation equal to K (currently not implemented).


seed: a seed for set.seed.


context.effect: either NULL or an integer vector of length two. If it is NULL, all the coefficients (beta) of the individual-level covariates are functions of context-level features (tau). If it is not NULL, the first component of the vector indicates the index of the lower-level covariate (X) whose linear effect beta depends on context (tau), where 0 is the intercept. The second component indicates the index of the context-level covariate (W) whose linear coefficient (tau) is non-zero.


same.clusters.across.contexts: boolean. If TRUE, all the contexts will have the same number of clusters and each cluster will have the same coefficient vector beta in every context.


context.dependent.cluster: integer, indicates which cluster will be context-dependent. If zero, all clusters will be context-dependent.
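
The single-context case described by the arguments above (n cases, K clusters, nCov covariates, mixture probabilities pi, one coefficient vector of size nCov+1 per cluster) can be illustrated with a short base-R sketch. This is not the package's implementation. The helper name simulate_mixture_sketch, the standard-normal covariates, the unit error variance, and the default coefficient draws are all assumptions made only for illustration.

```r
# Minimal base-R sketch of a single-context finite mixture of Gaussian GLMs.
# Assumptions (not taken from the package): standard-normal covariates,
# unit error variance, and the hypothetical helper name below.
simulate_mixture_sketch <- function(n, K, nCov = 2, pi = NULL, beta = NULL,
                                    seed = NULL) {
  if (!is.null(seed)) set.seed(seed)

  # Equal mixture probabilities unless a length-K vector is supplied.
  if (is.null(pi)) pi <- rep(1 / K, K)
  stopifnot(length(pi) == K, abs(sum(pi) - 1) < 1e-8)

  # One coefficient vector of size nCov + 1 (intercept first) per cluster.
  if (is.null(beta)) {
    beta <- lapply(seq_len(K), function(k) rnorm(nCov + 1, sd = 3))
  }
  stopifnot(length(beta) == K, all(lengths(beta) == nCov + 1))

  # Latent cluster label for each case, drawn with probabilities pi.
  z <- sample.int(K, size = n, replace = TRUE, prob = pi)

  # Design matrix: intercept column followed by nCov covariates.
  X <- cbind(1, matrix(rnorm(n * nCov), nrow = n, ncol = nCov))
  colnames(X) <- c("(Intercept)", paste0("X", seq_len(nCov)))

  # Each case uses the coefficient vector of its own cluster.
  B  <- do.call(rbind, beta)                 # K x (nCov + 1)
  mu <- rowSums(X * B[z, , drop = FALSE])
  y  <- mu + rnorm(n)

  data.frame(y = y, X[, -1, drop = FALSE], cluster = z)
}

sim <- simulate_mixture_sketch(n = 200, K = 3, nCov = 2,
                               pi = c(0.5, 0.3, 0.2), seed = 1)
```

The returned data frame has one row per case, the outcome y, the covariates X1 to XnCov, and the latent cluster label, which makes it easy to check how well an estimator recovers the clusters.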

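
For the multi-context case, the dimensions stated for tau and the description of same.K suggest the following base-R sketch. It only illustrates the shapes involved. The linear form tau %*% W_j for the expected coefficients, the flooring of cluster counts at 1, and the normal draws for tau and W are assumptions for illustration, not behavior confirmed by this page.

```r
# Sketch of the context-level quantities, in base R.
# Assumptions (hypothetical, not the package's internals): the expected
# coefficients of context j are tau %*% W_j, cluster counts are floored
# at 1, and tau and W are filled with standard-normal draws.
set.seed(42)
J <- 4; K <- 3; nCov <- 2; nCovj <- 1
same.K <- FALSE

# Number of clusters per context: fixed at K, or Poisson with mean K.
K_j <- if (same.K) rep(K, J) else pmax(1L, rpois(J, lambda = K))

# tau: one row per individual-level coefficient (intercept + nCov),
# one column per context-level term (intercept + nCovj).
tau <- matrix(rnorm((nCov + 1) * (nCovj + 1)),
              nrow = nCov + 1, ncol = nCovj + 1)

# Context-level design matrix W: J rows, intercept plus nCovj covariates.
W <- cbind(1, matrix(rnorm(J * nCovj), nrow = J, ncol = nCovj))

# Expected coefficient vector for each context, stored column by column.
beta_mean <- tau %*% t(W)                    # (nCov + 1) x J
```

With nCovj=0 and J=1 the same code reduces to a single column, which matches the single-context requirement that tau be a 1x1 matrix only when the coefficients carry no context-level structure.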
hdpGLM documentation built on Oct. 13, 2023, 1:17 a.m.