partial: Better, nicer, friendlier partial dependence plots

View source: R/partials.R

partial    R Documentation

Better, nicer, friendlier partial dependence plots

Description

Partial dependence plots show the response curve of an individual variable in a sum-of-trees model. The main line is the average of the partial dependence curves across the posterior draws of the sum-of-trees model; each of those curves is generated by evaluating the BART model prediction at each specified x value for *every combination of the other x values in the data*. This is computationally very expensive, and run time grows with how much smoothing you add (smooth), how many variables you ask for, and how many posterior draws (ndpost; defaults to 1000) were requested in the bart() function.
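The averaging described above can be sketched in base R. This is a generic illustration, not the code used by partial(): lm() stands in for the BART model, bootstrap refits stand in for posterior draws, and pd_curve, n_draws and ci_width are hypothetical names introduced only for this sketch.

```r
set.seed(1)
n <- 200
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n, sd = 0.1)

# Hypothetical helper: fix `var` at each grid value for every row of the data,
# predict, and average the predictions over the rows.
pd_curve <- function(fit, data, var, grid) {
  sapply(grid, function(v) {
    data[[var]] <- v
    mean(predict(fit, newdata = data))
  })
}

grid <- seq(0, 1, length.out = 5)
n_draws <- 50  # stand-in for ndpost (assumption: far fewer than the default)

# One curve per "draw"; bootstrap refits of lm() stand in for posterior draws.
curves <- t(replicate(n_draws, {
  boot <- dat[sample(n, replace = TRUE), ]
  pd_curve(lm(y ~ x1 + x2, data = boot), dat, "x1", grid)
}))

# Main line: average of the per-draw curves at each grid value.
main_line <- colMeans(curves)

# Pointwise interval of the chosen width across draws.
ci_width <- 0.95
tail_prob <- (1 - ci_width) / 2
ci <- apply(curves, 2, quantile, probs = c(tail_prob, 1 - tail_prob))
```

The cost is visible in the structure: every curve needs one prediction per data row per grid value, repeated for every draw, so the work scales roughly as draws x grid values x rows for each variable requested.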

Usage

partial(
  model,
  x.vars = NULL,
  equal = TRUE,
  smooth = 1,
  ci = TRUE,
  ciwidth = 0.95,
  trace = TRUE,
  transform = TRUE,
  panels = FALSE
)

Arguments

model

A dbarts model object

x.vars

A list of the variables for which you want to run the partials. Defaults to doing all of them.

equal

Space the x levels equally instead of using quantiles, which is what dbarts does by default. With quantiles, the distribution of points reflects the distribution of samples in the data, which produces irregular patterns that don't look very smooth.

smooth

A multiplier for how finely the x levels are sampled. High values, like 10 or over, are much slower and don't add much.

ci

Plot a credible interval with a blue bar. The interval defaults to 95% and is controlled by ciwidth.

ciwidth

Specify the width of the plotted credible interval, as a proportion (defaults to 0.95).

trace

Plot a trace line for each individual draw from the posterior.

transform

This converts from the logit output of dbarts:::predict to actual 0 to 1 probabilities. I wouldn't turn this off unless you're really interested in a deep dive on the model.

panels

For multiple variables, use this to create a multipanel figure.
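The difference between quantile-based and equally spaced levels (the equal argument) can be illustrated with a small base R sketch. It does not call dbarts or partial(); base_levels is a hypothetical number of levels, and treating smooth as a multiplier on that number is an assumption made purely for illustration.

```r
set.seed(42)
x <- rexp(500)      # skewed toy predictor
base_levels <- 10   # hypothetical number of levels
smooth <- 2         # assumed here to scale the number of levels

n_levels <- base_levels * smooth

# Quantile levels follow the data: dense where samples are dense.
quantile_levels <- unname(quantile(x, probs = seq(0, 1, length.out = n_levels)))

# Equal levels cover the same range with a constant step.
equal_levels <- seq(min(x), max(x), length.out = n_levels)

# Compare the smallest and largest gaps between neighbouring levels.
range(diff(quantile_levels))
range(diff(equal_levels))
```

For a skewed variable like this one, the quantile levels crowd together at low values and leave wide gaps in the tail, which is the kind of uneven sampling that equal = TRUE is meant to avoid.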

Value

Returns a ggplot object or cowplot object.

Examples

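library(dbarts)       # provides bart()
library(embarcadero)  # provides partial()

set.seed(99)  # set the seed before any random draws so the data are reproducible

# True mean function; the whole expression is returned, including the x[,4] term
f <- function(x) { 0.5 * x[,1] + 2 * x[,2] * x[,3] - 5 * x[,4] }
sigma <- 0.2
n <- 100
x <- matrix(2 * runif(n * 3) - 1, ncol = 3)
x <- data.frame(x)
x[,4] <- rbinom(n, 1, 0.3)  # binary fourth predictor
colnames(x) <- c('rob', 'hugh', 'ed', 'phil')
Ey <- f(x)
y  <- rnorm(n, Ey, sigma)
df <- data.frame(y, x)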

bartFit <- bart(y ~ rob + hugh + ed + phil, df,
               keepevery = 10, ntree = 100, keeptrees = TRUE)

partial(bartFit, x.vars='hugh', trace=TRUE, ci=TRUE)
partial(bartFit, x.vars='hugh', equal=TRUE, trace=TRUE, ci=TRUE)
partial(bartFit, x.vars='hugh', equal=TRUE, smooth=10, trace=TRUE, ci=TRUE)

partial(bartFit, x.vars='rob', equal=TRUE, smooth=10, trace=FALSE, ci=TRUE)
partial(bartFit, x.vars='ed', equal=TRUE, smooth=10, trace=TRUE, ci=FALSE)
partial(bartFit, equal=TRUE, smooth=10, trace=FALSE, ci=TRUE, panels=TRUE)


cjcarlson/embarcadero documentation built on Sept. 9, 2023, 10:47 p.m.