varyknotsIC: Information criterion for spline regression with a variable...

View source: R/gareg_knots.R

varyknotsIC R Documentation

Information criterion for spline regression with a variable number of knots

Description

Evaluates an information criterion (BIC, AIC, or AICc) for a regression of y on a spline basis of x, where the number and locations of the interior knots are encoded in the chromosome knot_bin. Designed for use as a GA objective/fitness function. The spline basis is constructed via [splineX()].

Usage

varyknotsIC(
  knot_bin,
  plen = 0,
  y,
  x,
  x_unique,
  x_base = NULL,
  degree = 3L,
  type = c("ppolys", "ns", "bs"),
  intercept = TRUE,
  ic_method = "BIC"
)

Arguments

knot_bin

Integer vector (chromosome). Gene 1 stores m, the number of interior knots. Genes 2:(1+m) are indices into x_unique selecting the m interior knots, followed by a sentinel equal to length(x_unique)+1. Only genes strictly before the first occurrence of length(x_unique)+1 are treated as interior indices; genes after the sentinel are ignored. Interior indices must be in 2:(length(x_unique)-1), finite, and non-duplicated.

plen

Unused placeholder kept for API compatibility; ignored.

y

Numeric response vector of length n.

x

Numeric predictor (same length as y) on which the spline basis is constructed.

x_unique

Optional numeric vector of unique candidate knot locations. If missing or NULL, defaults to sort(unique(x)). Must have at least three values (two boundaries + one interior) to allow any knots.

x_base

Optional matrix (or vector) of additional covariates to include linearly alongside the spline basis; coerced to a matrix if supplied.

degree

Integer polynomial degree for type="ppolys" and type="bs" (default 3L). Ignored for type="ns" (always cubic).

type

One of c("ppolys","ns","bs"); forwarded to [splineX()].

intercept

Logical; forwarded to [splineX()]. For m > 0, the spline block is splineX(..., intercept = intercept) and this function adds no separate column of ones, so whether the model has an intercept depends on this argument. For m = 0, an explicit intercept is added via cbind(1, x_base). Set intercept = FALSE if you plan to add your own column of ones.

ic_method

Which information criterion to return: "BIC", "AIC", or "AICc".
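
The chromosome layout described for knot_bin can be illustrated with a small decoder. This is only a sketch under stated assumptions, not the package's code: decode_knots() is a hypothetical helper, and both the check that the number of indices before the sentinel equals gene 1 and the treatment of a missing sentinel as invalid are assumptions.

```r
# Hypothetical helper (not part of GAReg): turn a chromosome into knot locations.
decode_knots <- function(knot_bin, x_unique) {
  n_u <- length(x_unique)
  sentinel <- n_u + 1L
  m <- knot_bin[1]                      # gene 1: number of interior knots
  genes <- knot_bin[-1]
  stop_at <- match(sentinel, genes)     # first occurrence of the sentinel
  if (is.na(stop_at)) return(NULL)      # assumption: no sentinel means invalid
  idx <- genes[seq_len(stop_at - 1L)]   # genes strictly before the sentinel
  ok <- length(idx) == m &&             # assumption: count must agree with gene 1
    all(is.finite(idx)) &&
    !anyDuplicated(idx) &&
    all(idx >= 2 & idx <= n_u - 1)      # interior indices only
  if (!ok) return(NULL)
  x_unique[idx]
}

x_unique <- seq(0, 1, by = 0.1)         # 11 candidate locations, sentinel is 12
chrom <- c(3, 4, 6, 9, 12, 7, 2)        # m = 3; genes after the sentinel are ignored
knots <- decode_knots(chrom, x_unique)  # the 4th, 6th and 9th candidate locations
```

With this layout a chromosome such as c(0, 12) decodes to an empty knot set, which corresponds to the m = 0 baseline.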

Details

If m = 0, the model is a pure-linear baseline using only an intercept and x_base: X <- cbind(1, x_base) (no spline terms). For m > 0, the spline block is built with [splineX()] using the selected interior knots, with X <- cbind(splineX(..., intercept=intercept), x_base).
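
The two design-matrix cases can be sketched as follows. Because the internals of splineX() are not shown on this page, the sketch substitutes splines::bs() as a stand-in; the data, the covariate z, and the knot locations are made up for illustration.

```r
library(splines)  # stand-in only: bs() takes the place of splineX() here

set.seed(42)
n <- 60
x <- seq(0, 1, length.out = n)
x_base <- cbind(z = rnorm(n))   # one extra linear covariate (illustrative)

# m = 0: intercept plus x_base, no spline terms
X0 <- cbind(1, x_base)

# m > 0: spline block plus x_base; with intercept = TRUE the constant
# lives inside the basis, so no separate column of ones is added
knots <- c(0.25, 0.5, 0.75)
X3 <- cbind(bs(x, knots = knots, degree = 3, intercept = TRUE), x_base)

ncol(X0)  # 2 columns: intercept + z
ncol(X3)  # 8 columns: 3 knots + degree 3 + 1 basis columns, plus z
```

The column count p of whichever matrix is used is what enters the penalty term of the criterion.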

The criteria are computed as:

\mathrm{BIC} = n \log(\mathrm{SSRes}/n) + p \log n,

\mathrm{AIC} = n \log(\mathrm{SSRes}/n) + 2p,

\mathrm{AICc} = n \log(\mathrm{SSRes}/n) + 2p + \frac{2p(p+1)}{n-p-1},

where n is the number of observations (the length of y), \mathrm{SSRes} is the residual sum of squares, and p is the number of columns in the design matrix X.
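
As a worked illustration of the three formulas, the sketch below fits a least-squares model with lm.fit() and evaluates each criterion. ic_from_design() is a hypothetical helper written for this example, not a GAReg function, and the simulated data are arbitrary.

```r
# Hypothetical helper (not part of GAReg): least-squares fit, then the criterion.
ic_from_design <- function(y, X, ic_method = c("BIC", "AIC", "AICc")) {
  ic_method <- match.arg(ic_method)
  n <- length(y)
  p <- ncol(X)
  ssres <- sum(lm.fit(X, y)$residuals^2)   # residual sum of squares
  base <- n * log(ssres / n)               # term shared by all three criteria
  switch(ic_method,
    BIC  = base + p * log(n),
    AIC  = base + 2 * p,
    AICc = base + 2 * p + 2 * p * (p + 1) / (n - p - 1)
  )
}

set.seed(1)
x <- seq(0, 1, length.out = 50)
y <- 1 + 2 * x + rnorm(50, sd = 0.1)
X <- cbind(1, x)                           # n = 50, p = 2

bic  <- ic_from_design(y, X, "BIC")
aic  <- ic_from_design(y, X, "AIC")
aicc <- ic_from_design(y, X, "AICc")
```

The criteria differ only in their penalty terms: here bic - aic equals 2 * log(50) - 4 and aicc - aic equals 12 / 47.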

Value

A single numeric value: the requested information criterion (lower is better). Returns Inf for invalid chromosomes/inputs.

Note

This function allows m=0 (no spline terms) so that the GA can compare against a pure-linear baseline (intercept + x_base). Spacing constraints (e.g., minimum distance between indices) should be enforced by the GA operators or an external penalty.
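
One way to apply an external spacing penalty is to wrap the fitness function. The sketch below is an assumption about how such a wrapper might look, not something GAReg provides: with_min_gap() and toy_fitness() are hypothetical names, and the toy fitness exists only so the example runs without the package.

```r
# Hypothetical wrapper (not part of GAReg): return Inf when any two selected
# indices are closer than min_gap, otherwise defer to the supplied fitness.
with_min_gap <- function(fitness, min_gap) {
  function(knot_bin, ...) {
    m <- knot_bin[1]
    idx <- sort(knot_bin[1 + seq_len(m)])   # the m interior indices
    if (m > 1 && min(diff(idx)) < min_gap) return(Inf)
    fitness(knot_bin, ...)
  }
}

# Toy fitness so the sketch is self-contained; in practice pass varyknotsIC.
toy_fitness <- function(knot_bin, ...) knot_bin[1]
spaced <- with_min_gap(toy_fitness, min_gap = 3)

ok_value  <- spaced(c(3, 10, 20, 30, 95))   # gaps of 10: passed to toy_fitness
bad_value <- spaced(c(3, 10, 11, 30, 95))   # gap of 1 < 3: penalized with Inf
```

Returning Inf for a spacing violation matches the convention this function uses for invalid chromosomes, so a minimizing GA discards such candidates in the same way.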

See Also

[fixknotsIC()], [splineX()], bs, ns

Examples

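## Example with the 'mcycle' data (MASS)
if (requireNamespace("MASS", quietly = TRUE)) {
  data(mcycle, package = "MASS")
  y <- mcycle$accel
  x <- mcycle$times
  x_unique <- sort(unique(x))
  # Gene 1 is m = 5, then five indices into x_unique, then the sentinel
  chrom <- c(5, 24, 30, 46, 49, 69, length(x_unique) + 1)
  varyknotsIC(chrom, y = y, x = x, x_unique = x_unique,
              type = "ppolys", degree = 3, ic_method = "BIC")
}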

GAReg documentation built on March 29, 2026, 5:08 p.m.