knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5 )
set.seed(89)
library(ggplot2)
library(cjbart)
This vignette provides a brief example of how to estimate heterogeneous effects in conjoint experiments using the cjbart package.
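We first simulate a basic conjoint design for the purpose of illustration. Suppose we conducted a conjoint experiment with 250 individuals, where each individual makes 5 discrete choices between two profiles. Each profile has five attributes (A-E), and we also record two covariates for each subject. The resulting data contains 2500 observations (250 subjects × 5 rounds × 2 profiles). The outcome is simulated so that the effect of attribute-level e2 depends on the first covariate, which gives the model some heterogeneity to detect.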
subjects <- 250
rounds <- 5
profiles <- 2
obs <- subjects*rounds*profiles

# covar1, covar2 and id1 are built with subjects*rounds entries;
# data.frame() recycles them (here, twice) to fill all obs rows
fake_data <- data.frame(A = sample(c("a1","a2","a3"), obs, replace = TRUE),
                        B = sample(c("b1","b2","b3"), obs, replace = TRUE),
                        C = sample(c("c1","c2","c3"), obs, replace = TRUE),
                        D = sample(c("d1","d2","d3"), obs, replace = TRUE),
                        E = sample(c("e1","e2","e3"), obs, replace = TRUE),
                        covar1 = rep(runif(subjects, 0, 1), each = rounds),
                        covar2 = rep(sample(c(1,0), subjects, replace = TRUE), each = rounds),
                        id1 = rep(1:subjects, each = rounds),
                        stringsAsFactors = TRUE)

# When E is "e2" the probability of choosing the profile equals covar1;
# for all other levels of E the choice is a fair coin flip
fake_data$Y <- ifelse(fake_data$E == "e2",
                      rbinom(obs, 1, fake_data$covar1),
                      sample(c(0,1), obs, replace = TRUE))
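To see where the heterogeneity in this simulation comes from, it can help to tabulate choice rates by attribute-level and covariate value. The following is a minimal, standalone sketch in base R: it re-creates only the E attribute, covar1, and the outcome on a larger sample, and the object names (`covar_bin`, `choice_rates`) are ours rather than part of cjbart.

```r
# Standalone sketch: mirrors the outcome rule used for fake_data$Y above
set.seed(1)
n <- 20000
E <- sample(c("e1", "e2", "e3"), n, replace = TRUE)
covar1 <- runif(n, 0, 1)
Y <- ifelse(E == "e2",
            rbinom(n, 1, covar1),
            sample(c(0, 1), n, replace = TRUE))

# Split subjects into low and high values of covar1
covar_bin <- cut(covar1, breaks = c(0, 0.5, 1),
                 labels = c("low", "high"), include.lowest = TRUE)

# Share of profiles chosen, by covariate bin (rows) and level of E (columns)
choice_rates <- tapply(Y, list(covar_bin, E), mean)
round(choice_rates, 2)
```

Under this rule the choice rate for e1 and e3 should sit near 0.5 in both rows, while the rate for e2 should be noticeably lower in the "low" row than in the "high" row. This is the pattern we would hope the individual-level estimates recover.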
To estimate a conjoint model on this data we will use the cjbart() function, which uses Bayesian Additive Regression Trees (BART) to approximate the relationships between the outcome (a binary choice), the randomized attribute-levels, and the included covariates. The result is a modified pbart model, which we will use to estimate heterogeneity in the treatment effects:
cj_model <- cjbart(data = fake_data, Y = "Y", id = "id1")
To generate heterogeneity estimates at the observation- and individual-level, we use the IMCE() function. We pass in both the original experimental data and the conjoint model generated in the last step. At this point, we also declare the names of the conjoint attribute variables (attribs) and the reference category we want to use for each attribute when calculating the marginal component effects (ref_levels). The vector of reference levels must be the same length as attribs, with the level at position i corresponding to attribs[i]:
het_effects <- IMCE(data = fake_data, model = cj_model, attribs = c("A","B","C","D","E"), ref_levels = c("a1","b1","c1","d1","e1"), cores = 1)
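Because attribs and ref_levels are matched by position, a misordered vector is an easy mistake to make. Below is a small sketch of a pre-flight check in base R. The helper `check_ref_levels()` is hypothetical (it is not a cjbart function), and the `toy` data frame exists only so the example runs on its own.

```r
# Hypothetical helper, not part of cjbart: confirms that ref_levels[i]
# is an observed level of the attribute named in attribs[i]
check_ref_levels <- function(data, attribs, ref_levels) {
  if (length(attribs) != length(ref_levels)) {
    stop("attribs and ref_levels must have the same length")
  }
  found <- mapply(function(a, r) r %in% levels(as.factor(data[[a]])),
                  attribs, ref_levels)
  if (!all(found)) {
    stop("Reference level not found for attribute(s): ",
         paste(attribs[!found], collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data so the sketch is self-contained
toy <- data.frame(A = c("a1", "a2", "a3"),
                  B = c("b1", "b2", "b3"),
                  stringsAsFactors = TRUE)

# Passes silently: "a1" belongs to A and "b1" belongs to B
check_ref_levels(toy, attribs = c("A", "B"), ref_levels = c("a1", "b1"))
```

The same call could be made with fake_data and the five attributes before running IMCE().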
IMCE() returns an object of class "cjbart", which contains the individual-level marginal component effects (IMCEs), matrices reflecting the upper and lower values of each estimate's confidence interval respectively, and optionally the underlying observation-level marginal component effects (OMCEs) when keep_omces = TRUE.
It is likely that users will want to compare the conjoint estimation strategy to other estimation strategies, such as logistic regression models. We can recover the average marginal component effect (AMCE) for the BART model using the summary() command:
summary(het_effects)
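As a rough benchmark for such a comparison, the sketch below fits a regression to data generated by the same outcome rule as above. It is standalone base R and makes two assumptions of ours: it uses a linear probability model via lm() rather than a logistic regression, since its coefficients are already on the probability scale, and it simulates only the E attribute and covar1. All object names are illustrative.

```r
# Standalone sketch: same outcome rule as the simulated data in this vignette
set.seed(3)
n <- 20000
sim <- data.frame(
  E = factor(sample(c("e1", "e2", "e3"), n, replace = TRUE)),
  covar1 = runif(n, 0, 1)
)
sim$Y <- ifelse(sim$E == "e2",
                rbinom(n, 1, sim$covar1),
                sample(c(0, 1), n, replace = TRUE))

# Average effect of e2 relative to the reference level e1
lpm <- lm(Y ~ E, data = sim)
amce_e2 <- coef(lpm)[["Ee2"]]

# Interacting with covar1 exposes the heterogeneity behind that average
lpm_het <- lm(Y ~ E * covar1, data = sim)
slope_e2 <- coef(lpm_het)[["Ee2:covar1"]]

c(amce_e2 = amce_e2, slope_e2 = slope_e2)
```

In this design the average effect of e2 should be close to zero, because covar1 averages 0.5, even though the effect varies strongly with covar1. An average near zero can therefore coexist with substantial individual-level heterogeneity, which is what the IMCEs are designed to capture.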
We can plot the IMCEs, color coding the points by some covariate value, using the in-built plot() function:
plot(het_effects, covar = "covar1")
To aid presentation, the plot function can restrict which attribute-levels are displayed by using the plot_levels argument:
plot(het_effects, covar = "covar1", plot_levels = c("a2","a3","e2","e3"))
We can estimate the importance of covariates to the model using the het_vimp() function, which calculates standardized importance scores using random forest-based permutation tests. Calling plot() on the result will return a heatmap of these scores for each combination of attribute-level and covariate. Users can restrict which attribute-levels and covariates are considered; see the documentation for more details:
vimp_estimates <- het_vimp(imces = het_effects, cores = 1)
plot(vimp_estimates)
Supplying a single attribute-level, the same plot function will display the importance estimates with corresponding 95 percent confidence intervals:
plot(vimp_estimates, att_levels = "d3")
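For intuition about what a permutation-based importance score measures, here is a minimal standalone sketch. It is not how het_vimp() is implemented: we swap in a plain linear model for the random forest so that the example needs only base R, and the `imce` vector is a made-up stand-in for real IMCE estimates. The idea is to shuffle one covariate at a time and record how much the prediction error increases.

```r
set.seed(2)
n <- 2000
covar1 <- runif(n)
covar2 <- sample(c(0, 1), n, replace = TRUE)

# Made-up IMCE-like values that depend on covar1 but not on covar2
imce <- (covar1 - 0.5) + rnorm(n, sd = 0.1)
dat <- data.frame(imce, covar1, covar2)

# Stand-in predictive model (a random forest would take this role)
fit <- lm(imce ~ covar1 + covar2, data = dat)
mse <- function(d) mean((d$imce - predict(fit, newdata = d))^2)
baseline <- mse(dat)

# Importance = average increase in MSE after shuffling one covariate
perm_importance <- sapply(c("covar1", "covar2"), function(v) {
  increases <- replicate(50, {
    shuffled <- dat
    shuffled[[v]] <- sample(shuffled[[v]])
    mse(shuffled) - baseline
  })
  mean(increases)
})

perm_importance
```

Shuffling covar1 should raise the error substantially, while shuffling covar2 should leave it almost unchanged, matching how the fake outcome was built.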
Finally, it is possible to estimate population IMCEs (pIMCEs) using pIMCE(). This function requires a list of the marginal probabilities for each attribute in the population of interest, and estimates can only be computed for one attribute-level comparison at a time (due to the computational requirements):
fake_marginals <- list()
fake_marginals[["A"]] <- c("a1" = 0.33, "a2" = 0.33, "a3" = 0.34)
fake_marginals[["B"]] <- c("b1" = 0.33, "b2" = 0.33, "b3" = 0.34)
fake_marginals[["C"]] <- c("c1" = 0.33, "c2" = 0.33, "c3" = 0.34)
fake_marginals[["D"]] <- c("d1" = 0.75, "d2" = 0.2, "d3" = 0.05)
fake_marginals[["E"]] <- c("e1" = 0.33, "e2" = 0.33, "e3" = 0.34)

# Use only a small subset of the covariate data for the sake of speed
fake_pimces <- pIMCE(covar_data = fake_data[fake_data$id1 %in% 1:3, c("id1","covar1","covar2")],
                     model = cj_model,
                     attribs = c("A","B","C","D","E"),
                     l = "E",
                     l_1 = "e2",
                     l_0 = "e1",
                     marginals = fake_marginals,
                     method = "bayes",
                     cores = 1)

head(fake_pimces)
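Since the marginals are typed in by hand, it may be worth validating them before a long-running call. The sketch below assumes, as the term "marginal probabilities" suggests, that each attribute's vector should be named by that attribute's levels and sum to one. `check_marginals()` and `attrib_levels` are hypothetical names of ours rather than cjbart objects, and only two attributes are shown to keep the example short.

```r
# Two of the marginal distributions from above, repeated so the sketch runs alone
fake_marginals <- list(
  A = c("a1" = 0.33, "a2" = 0.33, "a3" = 0.34),
  D = c("d1" = 0.75, "d2" = 0.2, "d3" = 0.05)
)

# Levels each attribute takes in the experimental data
attrib_levels <- list(
  A = c("a1", "a2", "a3"),
  D = c("d1", "d2", "d3")
)

# Hypothetical helper: one TRUE/FALSE per attribute
check_marginals <- function(marginals, attrib_levels, tol = 1e-8) {
  vapply(names(attrib_levels), function(a) {
    m <- marginals[[a]]
    !is.null(m) &&
      setequal(names(m), attrib_levels[[a]]) &&  # names match the levels
      all(m >= 0) &&                             # valid probabilities
      abs(sum(m) - 1) < tol                      # sums to one
  }, logical(1))
}

marginals_ok <- check_marginals(fake_marginals, attrib_levels)
marginals_ok
```

With the full data, `attrib_levels` could be built from the factor columns, for example with lapply(fake_data[c("A","B","C","D","E")], levels).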