pboost: Profile Boosting Framework

pboost    R Documentation

Profile Boosting Framework

Description

pboost is the generic workhorse function of the profile boosting framework for parametric regression.

Usage

pboost(
  formula,
  data,
  fitFun,
  scoreFun,
  stopFun,
  ...,
  keep = NULL,
  maxK = NULL,
  verbose = FALSE
)

Arguments

formula

An object of class formula of the form LHS ~ RHS, where the right-hand side (RHS) specifies the candidate features for the linear predictor \eta = \sum_j \beta_j x_j.

The following restrictions and recommendations apply:

  • All variables appearing on the RHS must be numeric in the supplied data.

  • For computational efficiency, each term on the RHS must correspond to a single column in the resulting model matrix. Supported expressions include main effects (x1), interactions (x1:x2), and simple transformations (log(x1), I(x1^2), etc.). Complex terms that expand into multiple columns—such as poly(x, degree), bs(x), or ns(x)—are not supported.

  • Offset terms should not be included in the formula. Instead, provide them via the dedicated offset argument of fitFun.
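The single-column restriction above can be checked before calling pboost. The following is a minimal sketch using base R only; multiColumnTerms is a hypothetical helper name and is not part of the package.

```r
# Hypothetical helper (not part of pboost): return the RHS terms that
# expand into more than one column of the model matrix.
multiColumnTerms <- function(formula, data) {
    tt <- terms(formula, data = data)
    labels <- attr(tt, "term.labels")
    mm <- model.matrix(tt, data)
    # "assign" maps each column to the index of its term (0 = intercept)
    asg <- attr(mm, "assign")
    counts <- tabulate(asg[asg > 0], nbins = length(labels))
    labels[counts > 1]
}

set.seed(1)
df <- data.frame(y = rnorm(20), x1 = runif(20, 1, 2), x2 = rnorm(20))

# Main effects, interactions and simple transformations: one column each
multiColumnTerms(y ~ x1 + x1:x2 + log(x1) + I(x1^2), df)

# poly() expands into several columns and is flagged
multiColumnTerms(y ~ x1 + poly(x2, 2), df)
```

A non-empty result lists the terms that would need to be rewritten as separate single-column terms.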

data

A data frame containing the variables in the model.

fitFun

Function that fits the model by minimizing the empirical risk function, called in the form fitFun(formula, data, ...).
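Any function that accepts a formula and a data frame in this calling form can in principle serve as fitFun. As an illustration only, the sketch below wraps glm with a fixed Poisson family; fitPoisson is a hypothetical name, and whether a given wrapper suits pboost also depends on having a matching scoreFun and stopFun.

```r
# Hypothetical wrapper (illustrative name): Poisson regression with the
# family fixed, following the fitFun(formula, data, ...) calling form.
fitPoisson <- function(formula, data, ...) {
    glm(formula, data = data, family = poisson(), ...)
}

set.seed(1)
df <- data.frame(y = rpois(50, 2), x1 = rnorm(50))
fit <- fitPoisson(y ~ x1, df)
class(fit)
```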

scoreFun

Function to compute the derivative of the empirical risk function, called in the form scoreFun(object), where object is the model returned by fitFun. scoreFun should return a vector of the same length as y in data.
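For a Gaussian linear model fitted with lm, a score function might look like the sketch below. It assumes the same sign convention as scoreLogistic in the Examples section (response minus fitted mean); scoreGaussian is a hypothetical name, not a function provided by the package.

```r
# Hypothetical score function for lm fits (assumed convention: y - fitted,
# mirroring the logistic example in this page).
scoreGaussian <- function(object) {
    unname(residuals(object, type = "response"))
}

fit <- lm(mpg ~ wt, data = mtcars)
s <- scoreGaussian(fit)

# One value per observation, as required of scoreFun
length(s) == nrow(mtcars)
```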

stopFun

Stopping rule for profile boosting, called in the form stopFun(object). It evaluates the performance of the model object returned by fitFun using a criterion such as EBIC or BIC.
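A user-written stopping rule only needs to take the fitted object and return the criterion value. The sketch below recomputes BIC by hand with base R functions; bicStop is a hypothetical name, and how pboost compares successive criterion values is not specified here.

```r
# Hypothetical stopping criterion (illustrative name): BIC computed as
# AIC with penalty log(n) per parameter.
bicStop <- function(object) {
    AIC(object, k = log(nobs(object)))
}

fit <- glm(am ~ wt, data = mtcars, family = binomial())

# Agrees with the built-in BIC for this fit
all.equal(bicStop(fit), BIC(fit))
```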

...

Additional arguments to be passed to fitFun.

keep

Initial set of features that are always included in model fitting. If keep is specified, every feature in it must also appear in the RHS of formula.

maxK

Maximum number of identified features. If maxK is specified, it overrides stopFun: profile boosting continues until the procedure has identified maxK features. The pre-specified features in keep are counted toward maxK.

verbose

Logical. If TRUE, print the procedure path.

Value

Model object fitted on the selected features.

Examples

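set.seed(2025)
n <- 200   # sample size
p <- 300   # number of candidate features (p > n)
x <- matrix(rnorm(n*p), n)
# Only the first three features carry signal
eta <- drop(x[, 1:3] %*% runif(3, 1.0, 1.5))
y <- rbinom(n, 1, 1/(1+exp(-eta)))
DF <- data.frame(y, x)

# Score for logistic regression: response minus fitted probability
scoreLogistic <- function(object) {
    eta.hat <- object[["linear.predictors"]]
    return(object[["y"]] - 1/(1+exp(-eta.hat)))
}

# Select features with EBIC as the stopping rule; family is passed on to glm
( result <- pboost(y~., DF, glm, scoreLogistic, EBIC, family="binomial") )

# Labels of the selected terms
attr(terms(formula(result), data=DF), "term.labels")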


pboost documentation built on Jan. 9, 2026, 1:07 a.m.