gg_partialpro: Split varpro partial dependence data into continuous or...

View source: R/gg_partialpro.R

gg_partialpro    R Documentation

Split varpro partial dependence data into continuous or categorical datasets

Description

Takes the list returned by varpro::partialpro and separates its variables into two data frames: one for continuous predictors (parametric, nonparametric, and causal effect curves) and one for categorical predictors (one row per observation per category level).

Usage

gg_partialpro(part_dta, nvars = NULL, cat_limit = 10, model = NULL)

Arguments

part_dta

partial plot data from varpro::partialpro: a list with one element per variable. Each element must contain the fields xvirtual, xorg, yhat.par, yhat.nonpar, and yhat.causal.

nvars

how many variables (list elements) to process. Defaults to all variables in part_dta.

cat_limit

Variables with length(xvirtual) <= cat_limit are treated as categorical. Default 10.

model

an optional label applied to all rows of the output (returned in the model column). Useful when combining results from multiple models in a single figure.
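
As a rough illustration of why a model label helps, the sketch below stacks two small data frames that stand in for the continuous element of two separate gg_partialpro calls. The frames, their values, and the labels "model_a" and "model_b" are made up for illustration; only the column names follow the Value section.

```r
# Stand-ins for gg_partialpro(..., model = "...")$continuous from two fits.
# All values here are invented; only the column layout follows the docs.
curve_a <- data.frame(
  variable = c(30, 55, 80),
  parametric = c(0.10, 0.25, 0.40),
  name = "age",
  model = "model_a"
)
curve_b <- data.frame(
  variable = c(30, 55, 80),
  parametric = c(0.05, 0.30, 0.35),
  name = "age",
  model = "model_b"
)

# Because each frame carries its own label, the rows remain distinguishable
# after stacking and can be mapped to colour or facets in a figure.
combined <- rbind(curve_a, curve_b)
print(table(combined$model))
```

With the label in place, a plotting layer can group or colour by model without any extra bookkeeping.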

Details

The split is governed by cat_limit: a variable is treated as continuous when length(xvirtual) > cat_limit; otherwise it is treated as categorical and the per-category rows are stacked.
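
The rule above can be restated in a few lines of base R. This is a minimal sketch, not the package's implementation: the helper is_continuous_var is hypothetical, and the mock entries hold only the xvirtual field that the rule inspects.

```r
# Hypothetical helper (not part of ggRandomForests): restates the documented
# rule that a variable is continuous when length(xvirtual) > cat_limit.
is_continuous_var <- function(var_entry, cat_limit = 10) {
  length(var_entry$xvirtual) > cat_limit
}

# Mock partialpro-style entries with only the field the rule looks at
mock <- list(
  age = list(xvirtual = seq(30, 80, length.out = 15)),  # 15 grid points
  sex = list(xvirtual = c(0, 1))                        # 2 grid points
)

# Default cat_limit = 10: age is continuous, sex is categorical
split_flags <- vapply(mock, is_continuous_var, logical(1))
print(split_flags)

# Raising cat_limit to 20 pushes age into the categorical group as well
split_flags_20 <- vapply(mock, is_continuous_var, logical(1), cat_limit = 20)
print(split_flags_20)
```

Note that a variable with exactly cat_limit grid points falls on the categorical side, since the comparison for continuous variables is strict.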

Value

A named list with two elements:

continuous

data.frame with columns variable, parametric, nonparametric, causal, name (and optionally model)

categorical

data.frame with the same columns but one row per observation per category level
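
To make the two row layouts concrete, the following base R sketch builds data frames with the shapes described above from mock matrices. It is an assumption, not a statement about the package internals, that the continuous curve is summarised as the column means of the yhat matrix; only the parametric column is shown for brevity, and all object names are illustrative.

```r
set.seed(1)
n_obs <- 5

# Continuous case: 12 grid points, one output row per grid point
x_cont <- seq(0, 1, length.out = 12)
yhat_cont <- matrix(rnorm(n_obs * 12), nrow = n_obs)
continuous_sketch <- data.frame(
  variable = x_cont,
  # ASSUMPTION: the curve is averaged over observations (column means)
  parametric = colMeans(yhat_cont),
  name = "x1"
)

# Categorical case: 2 levels, one output row per observation per level
x_cat <- c(0, 1)
yhat_cat <- matrix(rnorm(n_obs * 2), nrow = n_obs)
categorical_sketch <- data.frame(
  # as.vector() unrolls the matrix column by column, so each level is
  # repeated n_obs times to line up with its column of predictions
  variable = rep(x_cat, each = n_obs),
  parametric = as.vector(yhat_cat),
  name = "x2"
)

print(dim(continuous_sketch))
print(dim(categorical_sketch))
```

The categorical frame grows with the number of observations, while the continuous frame grows only with the size of the evaluation grid.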

See Also

gg_partial, varpro_feature_names

Examples

## Construct mock varpro partialpro output:
##   - "age": a continuous predictor (xvirtual has > 10 points)
##   - "sex": a categorical predictor (xvirtual has 2 points)
set.seed(42)
n_obs <- 30   # number of observations (rows in yhat matrices)
n_pts <- 15   # number of evaluation points for continuous variables

mock_data <- list(
  age = list(
    # xvirtual: evaluation grid for the marginal effect curve
    xvirtual   = seq(30, 80, length.out = n_pts),
    # xorg: original observed values (used only for categorical detection)
    xorg       = sample(seq(30, 80, by = 5), n_obs, replace = TRUE),
    # yhat matrices: n_obs rows x n_pts columns (predictions at each grid pt)
    yhat.par   = matrix(rnorm(n_obs * n_pts), nrow = n_obs),
    yhat.nonpar = matrix(rnorm(n_obs * n_pts), nrow = n_obs),
    yhat.causal = matrix(rnorm(n_obs * n_pts), nrow = n_obs)
  ),
  sex = list(
    # Two categories: the xvirtual grid has only 2 points
    xvirtual   = c(0, 1),
    xorg       = sample(c(0, 1), n_obs, replace = TRUE),
    # Two-column yhat matrices (one column per category)
    yhat.par   = matrix(rnorm(n_obs * 2), nrow = n_obs),
    yhat.nonpar = matrix(rnorm(n_obs * 2), nrow = n_obs),
    yhat.causal = matrix(rnorm(n_obs * 2), nrow = n_obs)
  )
)

result <- gg_partialpro(mock_data)

## Continuous result: one row per evaluation grid point
head(result$continuous)

## Categorical result: n_obs rows per category level
head(result$categorical)


ggRandomForests documentation built on May 2, 2026, 5:06 p.m.