This function takes a training data.frame and an optional test data.frame and performs posterior sampling. It returns posterior predictions and posterior cluster assignments for the training set and, if supplied, the test set. The function is built for zero-inflated, but otherwise continuous, outcomes.
ZDPMix(
d_train,
formula,
d_test = NULL,
burnin = 100,
iter = 1000,
phi_y = c(shape = 5, rate = 1000),
beta_prior_mean = NULL,
beta_prior_var = NULL,
gamma_prior_mean = NULL,
gamma_prior_var = NULL,
init_k = 10,
beta_var_scale = 1000,
mu_scale = 1,
tau_scale = 1,
prop_sigma_z = diag(rep(0.025, nparams))
)
d_train |
A |
formula |
Specified in the usual way, e.g. for |
d_test |
Optional |
burnin |
integer specifying number of burn-in MCMC draws. |
iter |
integer greater than |
phi_y |
Optional. Length two |
beta_prior_mean |
Optional. If there are |
beta_prior_var |
Optional. If there are |
gamma_prior_mean |
Optional. If there are |
gamma_prior_var |
Optional. If there are |
init_k |
Optional. Integer specifying the initial number of clusters used to start the MCMC sampler. |
beta_var_scale |
Optional. A multiplicative constant that scales |
mu_scale |
Optional. A numeric scalar that controls how widely the continuous-covariate means of new clusters are dispersed around the empirical covariate mean. Specifically, all continuous covariates are assumed to have a Gaussian likelihood with a Gaussian prior on their means. |
tau_scale |
Optional. A numeric scalar that controls how widely the continuous-covariate variances of new clusters are dispersed around the empirical variance. Specifically, all continuous covariates are assumed to have a Gaussian likelihood with an Inverse Gamma prior on their variance. |
prop_sigma_z |
Optional. If you specified |
Please see https://stablemarkets.github.io/ChiRPsite/index.html for examples and detailed model and parameter descriptions.
Please see https://arxiv.org/abs/1810.09494 for a methodological reference.
Returns predictions$train and cluster_inds$train. predictions$train returns an nrow(d_train) by iter - burnin matrix of posterior predictions. cluster_inds$train returns an nrow(d_train) by iter - burnin matrix of cluster assignment indicators, which can be input into the function cluster_assign_mode() to compute posterior mode assignment. predictions$test and cluster_inds$test are returned if d_test is specified.
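The returned matrices hold one row per observation and one column per retained draw, so they can be summarized row by row. The following is a minimal base-R sketch of that idea, not part of the package: the matrices `pred_train` and `clus_train` are simulated stand-ins with the documented shapes, and the sizes `n_obs` and `n_draws` are hypothetical. The co-clustering matrix shown here is a generic, label-invariant summary and is not claimed to be what cluster_assign_mode() computes.

```r
# Stand-in objects with the documented shapes. With a real fit, one would use
# res$predictions$train and res$cluster_inds$train instead (assumption: the
# fit is stored in an object called `res`, as in the Examples section).
set.seed(42)
n_obs   <- 6   # hypothetical number of training rows
n_draws <- 50  # hypothetical number of retained posterior draws

# Hypothetical posterior predictions: rows are observations, columns are draws.
pred_train <- matrix(rnorm(n_obs * n_draws, mean = 100, sd = 10),
                     nrow = n_obs, ncol = n_draws)

# Hypothetical cluster labels for each observation in each draw.
clus_train <- matrix(sample(1:3, n_obs * n_draws, replace = TRUE),
                     nrow = n_obs, ncol = n_draws)

# Posterior mean prediction and 95% credible interval per observation.
post_mean <- rowMeans(pred_train)
post_ci   <- t(apply(pred_train, 1, quantile, probs = c(0.025, 0.975)))

# Pairwise co-clustering proportion: the share of draws in which
# observations i and j carry the same cluster label.
co_cluster <- matrix(0, n_obs, n_obs)
for (s in seq_len(n_draws)) {
  co_cluster <- co_cluster + outer(clus_train[, s], clus_train[, s], "==")
}
co_cluster <- co_cluster / n_draws

summary_df <- data.frame(post_mean = post_mean,
                         lower = post_ci[, 1],
                         upper = post_ci[, 2])
print(summary_df)
print(round(co_cluster, 2))
```

Because raw cluster labels need not mean the same thing from one draw to the next, the pairwise summary compares labels only within a draw.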
set.seed(1)
n <- 200 ## generate from a clustered, skewed data distribution
X11 <- rnorm(n = n, mean = 10, sd = 3)
X12 <- rnorm(n = n, mean = 0, sd = 2)
X13 <- rnorm(n = n, mean = -10, sd = 4)
Y1 <- rnorm(n = n, mean = 100 + .5*X11, 20)*(1-rbinom(n, 1, prob = pnorm( -10 + 1*X11 ) ))
Y2 <- rnorm(n = n, mean = 200 + 1*X12, 30)*(1-rbinom(n, 1, prob = pnorm( 1 + .05*X12 ) ))
Y3 <- rnorm(n = n, mean = 300 + 2*X13, 40)*(1-rbinom(n, 1, prob = pnorm( -3 -.2*X13 ) ))
d <- data.frame(X1=c(X11, X12, X13), Y = c(Y1, Y2, Y3))
d$X1 <- scale(d$X1)
ids <- sample(1:600, size = 500, replace = FALSE )
d_train <- d[ids,]
d_test <- d[-ids, ]
res <- ChiRP::ZDPMix(d_train = d_train, d_test = d_test, formula = Y ~ X1,
burnin=100, iter=200, init_k = 5, phi_y = c(10, 10000))