gg_partialpro: Split varpro partial dependence data into continuous or...
In ggRandomForests: Visually Exploring Random Forests

gg_partialpro

R Documentation

Split varpro partial dependence data into continuous or categorical datasets

Description

Takes the list returned by varpro::partialpro and separates variables into two data frames: one for continuous predictors (parametric, nonparametric, and causal effect curves) and one for categorical predictors (one row per observation per category level).

Usage

gg_partialpro(part_dta, nvars = NULL, cat_limit = 10, model = NULL)

Arguments

`part_dta`	partial plot data from `varpro::partialpro`. Each element of the list must contain fields `xvirtual`, `xorg`, `yhat.par`, `yhat.nonpar`, and `yhat.causal`.
`nvars`	how many variables (list elements) to process. Defaults to all variables in `part_dta`.
`cat_limit`	Variables with `length(xvirtual) <= cat_limit` are treated as categorical. Defaults to `10`.
`model`	a label applied to all rows. Useful when combining results from multiple models in a single figure.
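
Since every element of `part_dta` must carry the five fields listed above, it can help to check the input before calling the function. The following is a minimal sketch in base R; `missing_partialpro_fields` and the `toy` list are hypothetical names introduced here for illustration and are not part of ggRandomForests or varpro.

```r
# Hypothetical helper (not part of ggRandomForests): report which of the
# documented required fields are missing from each element of the list.
required_fields <- c("xvirtual", "xorg", "yhat.par", "yhat.nonpar", "yhat.causal")

missing_partialpro_fields <- function(part_dta) {
  lapply(part_dta, function(el) setdiff(required_fields, names(el)))
}

# Toy input: "age" has every field, "sex" lacks yhat.causal.
toy <- list(
  age = list(xvirtual = 1:3, xorg = 1:3,
             yhat.par = matrix(0, 2, 3),
             yhat.nonpar = matrix(0, 2, 3),
             yhat.causal = matrix(0, 2, 3)),
  sex = list(xvirtual = 0:1, xorg = c(0, 1),
             yhat.par = matrix(0, 2, 2),
             yhat.nonpar = matrix(0, 2, 2))
)

# Named list: one character vector of missing field names per variable.
missing_fields <- missing_partialpro_fields(toy)
```

An empty character vector for a variable means that element has all of the fields named in the `part_dta` description.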

Details

The split is governed by cat_limit: a variable is treated as continuous when length(xvirtual) > cat_limit; otherwise it is treated as categorical and the per-category rows are stacked.
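
The rule above can be illustrated with a short base R sketch. `classify_by_cat_limit` is a hypothetical helper written for this illustration; it mirrors the documented comparison but is not the package's internal code.

```r
# Hypothetical helper mirroring the documented rule:
# continuous when length(xvirtual) > cat_limit, categorical otherwise.
classify_by_cat_limit <- function(part_dta, cat_limit = 10) {
  n_grid <- vapply(part_dta, function(el) length(el$xvirtual), integer(1))
  ifelse(n_grid > cat_limit, "continuous", "categorical")
}

toy <- list(
  age = list(xvirtual = seq(30, 80, length.out = 15)),  # 15 grid points
  sex = list(xvirtual = c(0, 1)),                       # 2 grid points
  grade = list(xvirtual = 1:10)                         # exactly cat_limit
)

kinds <- classify_by_cat_limit(toy)

# Raising cat_limit above 15 moves "age" to the categorical side.
kinds_strict <- classify_by_cat_limit(toy, cat_limit = 20)
```

Note that a variable whose grid length equals `cat_limit` falls on the categorical side, because the comparison for continuous variables is strict.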

Value

A named list with two elements:

continuous: data.frame with columns variable, parametric, nonparametric, causal, name (and optionally model)
categorical: data.frame with the same columns but one row per observation per category level
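
Because both data frames can carry a `model` column, results from several models can be stacked and grouped by that label. The sketch below uses mock data frames that only imitate the documented column layout; `make_mock_continuous`, the labels `"forest_a"` and `"forest_b"`, and all numbers are made up for illustration and do not come from gg_partialpro.

```r
# Mock stand-in for the `continuous` element of one model's result.
# Column names follow the Value section; the values are random.
make_mock_continuous <- function(model_label) {
  data.frame(
    variable      = seq(30, 80, length.out = 5),
    parametric    = rnorm(5),
    nonparametric = rnorm(5),
    causal        = rnorm(5),
    name          = "age",
    model         = model_label
  )
}

set.seed(1)
combined <- rbind(
  make_mock_continuous("forest_a"),
  make_mock_continuous("forest_b")
)

# Rows per model label; a plot can group or colour by `model`.
rows_per_model <- table(combined$model)
```

Stacking with `rbind` works here because both data frames share the same columns; in a plot, `model` could then serve as the grouping or colour variable.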

Examples

## Construct mock varpro partialpro output:
##   - "age": a continuous predictor (xvirtual has > 10 points)
##   - "sex": a categorical predictor (xvirtual has 2 points)
set.seed(42)
n_obs <- 30   # number of observations (rows in yhat matrices)
n_pts <- 15   # number of evaluation points for continuous variables

mock_data <- list(
  age = list(
    # xvirtual: evaluation grid for the marginal effect curve
    xvirtual   = seq(30, 80, length.out = n_pts),
    # xorg: original observed values (used only for categorical detection)
    xorg       = sample(seq(30, 80, by = 5), n_obs, replace = TRUE),
    # yhat matrices: n_obs rows x n_pts columns (predictions at each grid pt)
    yhat.par   = matrix(rnorm(n_obs * n_pts), nrow = n_obs),
    yhat.nonpar = matrix(rnorm(n_obs * n_pts), nrow = n_obs),
    yhat.causal = matrix(rnorm(n_obs * n_pts), nrow = n_obs)
  ),
  sex = list(
    # Two categories: the xvirtual grid has only 2 points
    xvirtual   = c(0, 1),
    xorg       = sample(c(0, 1), n_obs, replace = TRUE),
    # Two-column yhat matrices (one column per category)
    yhat.par   = matrix(rnorm(n_obs * 2), nrow = n_obs),
    yhat.nonpar = matrix(rnorm(n_obs * 2), nrow = n_obs),
    yhat.causal = matrix(rnorm(n_obs * 2), nrow = n_obs)
  )
)

result <- gg_partialpro(mock_data)

## Continuous result: one row per evaluation grid point
head(result$continuous)

## Categorical result: n_obs rows per category level
head(result$categorical)

ggRandomForests documentation built on May 12, 2026, 5:07 p.m.