nl_fit    R Documentation

Fit a nonlinear (spline or GAM) single-level or multilevel model

Description

Fits a nonlinear regression model for an outcome y with a focal predictor x, modeled using natural cubic splines ("ns"), B-splines ("bs"), or GAM smooths ("gam").

Version 2 additions: two-way and nested clustering via the nested argument; random spline slopes via random_slope; B-spline basis via method = "bs"; automatic df selection via df = "auto".

Usage

nl_fit(
  data,
  y,
  x,
  time = NULL,
  cluster = NULL,
  nested = FALSE,
  controls = NULL,
  method = c("ns", "bs", "gam"),
  df = 4,
  df_range = 2:8,
  df_criterion = c("AIC", "BIC"),
  k = 5,
  bs_degree = 3,
  random_slope = FALSE,
  family = stats::gaussian(),
  ...
)

Arguments

data

A data frame (often long format for longitudinal data).

y

Outcome variable name (string).

x

Focal nonlinear predictor name (string). Must be numeric.

time

Optional time variable name (string).

cluster

Optional character vector of grouping variable name(s) for random effects, e.g. "id", or c("id", "school") for two-way clustering. If NULL (default), no random effects are included and a single-level model is fitted.

nested

Logical; only used when length(cluster) == 2. If TRUE, uses a nested specification (1 | g1/g2), which in lme4 syntax expands to (1 | g1) + (1 | g1:g2), so g2 is nested within g1. If FALSE (default), uses a cross-classified specification (1 | g1) + (1 | g2).
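As a rough illustration of the two specifications, the sketch below builds the random-intercept term as a string in lme4 formula syntax. The helper re_term() is hypothetical and not part of the package, and it assumes that g1 and g2 correspond to the first and second elements of cluster; how nl_fit() assembles its formula internally is not shown in this page.

```r
# Hypothetical helper (not part of MultiSpline): build the random-intercept
# part of an lme4-style formula from one or two grouping variable names.
# Assumption: g1 is cluster[1] and g2 is cluster[2].
re_term <- function(cluster, nested = FALSE) {
  stopifnot(is.character(cluster), length(cluster) %in% 1:2)
  if (length(cluster) == 1L) {
    return(sprintf("(1 | %s)", cluster))
  }
  if (nested) {
    # In lme4 syntax, (1 | g1/g2) expands to (1 | g1) + (1 | g1:g2),
    # so g2 is nested within g1.
    sprintf("(1 | %s/%s)", cluster[1], cluster[2])
  } else {
    # Cross-classified: separate random intercepts for each factor.
    sprintf("(1 | %s) + (1 | %s)", cluster[1], cluster[2])
  }
}

re_term("id")
re_term(c("school", "id"), nested = TRUE)
re_term(c("school", "id"), nested = FALSE)
```

Under this assumption the outer grouping factor is listed first when nested = TRUE.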

controls

Optional character vector of additional covariate names to include as linear fixed effects.

method

Type of nonlinear term to use: "ns" (natural cubic spline via splines::ns(), default), "bs" (B-spline via splines::bs()), or "gam" (GAM smooth via mgcv::gam()). Multilevel fits require "ns" or "bs".

df

Degrees of freedom for the spline basis. Supply a single integer greater than or equal to 1, or the string "auto" to trigger automatic selection by information criterion. Default 4.

df_range

Integer vector of candidate df values evaluated when df = "auto". Default 2:8.

df_criterion

Information criterion used for automatic df selection: "AIC" (default) or "BIC".
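The idea behind df = "auto" can be sketched with base R alone: fit one model per candidate df and keep the value that minimizes the chosen criterion. This is a minimal single-level sketch on simulated data, not the package's implementation; the object names (dat, search, df_selected) are illustrative.

```r
library(splines)

# Simulated data, for illustration only.
set.seed(1)
n <- 200
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- sin(dat$x) + rnorm(n, sd = 0.3)

df_range <- 2:8

# Fit one natural-spline model per candidate df and record AIC and BIC.
search <- do.call(rbind, lapply(df_range, function(d) {
  m <- lm(y ~ ns(x, df = d), data = dat)
  data.frame(df = d, AIC = AIC(m), BIC = BIC(m))
}))

# Keep the df that minimizes the chosen criterion.
criterion <- "AIC"
df_selected <- search$df[which.min(search[[criterion]])]

search
df_selected
```

BIC penalizes additional parameters more heavily than AIC once the sample has more than about 8 observations, so setting criterion <- "BIC" never selects a larger df than AIC on the same set of fits.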

k

Basis dimension for mgcv::s() when method = "gam". Default 5.
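For orientation, the sketch below calls mgcv directly (not nl_fit()) on simulated data to show what a basis dimension of 5 means: with the default thin plate basis, the smooth gets k - 1 = 4 coefficients after the identifiability constraint, and its effective degrees of freedom cannot exceed that. The data and object names are illustrative.

```r
library(mgcv)

# Simulated data, for illustration only.
set.seed(1)
n <- 200
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- sin(dat$x) + rnorm(n, sd = 0.3)

# k sets the basis dimension of the smooth, i.e. an upper bound on its
# flexibility; the penalty then shrinks the fit toward a smoother curve.
m <- gam(y ~ s(x, k = 5), data = dat)

length(coef(m))     # intercept plus k - 1 smooth coefficients
summary(m)$s.table  # reports the effective df of s(x)
```

If the effective df sits close to k - 1, a larger k may be worth trying; mgcv::gam.check() offers a diagnostic for this.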

bs_degree

Polynomial degree for splines::bs() when method = "bs". Default 3 (cubic).
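The interplay between df and the polynomial degree can be checked directly with the splines package. In this sketch (plain splines calls, not nl_fit()), the basis always has df columns, while the number of interior knots is df - degree for splines::bs() and df - 1 for splines::ns(), both without an intercept column.

```r
library(splines)

x <- seq(0, 10, length.out = 50)

b_cubic  <- bs(x, df = 4, degree = 3)  # cubic B-spline
b_linear <- bs(x, df = 4, degree = 1)  # piecewise-linear B-spline
n_cubic  <- ns(x, df = 4)              # natural cubic spline

# All three bases have 4 columns.
c(ncol(b_cubic), ncol(b_linear), ncol(n_cubic))

# Interior knots: 1, 3, and 3 respectively.
c(length(attr(b_cubic, "knots")),
  length(attr(b_linear, "knots")),
  length(attr(n_cubic, "knots")))
```

Lowering the degree at fixed df therefore trades smoothness of each piece for more interior knots.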

random_slope

Logical; if TRUE and cluster is supplied, fits random spline slopes in addition to random intercepts. Currently implemented only for single-level clustering. Default FALSE.

family

A family object such as stats::gaussian() (default) or stats::binomial(). For multilevel fits, gaussian() uses lme4::lmer() and all other families use lme4::glmer().

...

Additional arguments passed to the underlying fitting function (stats::lm(), mgcv::gam(), lme4::lmer(), or lme4::glmer()).

Value

An object of class "nl_fit" (a named list). It contains the fitted model object, the method, variable names (y, x, time, cluster, controls), spline settings (df, df_selected, df_search, k, bs_degree), flags (nested, random_slope), the family, the model formula, the call, and metadata used for prediction (x_info, levels_info, control_defaults).

See Also

nl_predict, nl_derivatives, nl_compare, nl_r2, nl_knots

Examples

## Not run: 
# Single-level natural spline with automatic df selection
fit <- nl_fit(data = mydata, y = "score", x = "age", df = "auto")

# Two-way cross-classified clustering
fit2 <- nl_fit(
  data    = mydata,
  y       = "score",
  x       = "age",
  cluster = c("student_id", "school_id"),
  nested  = FALSE
)

# Nested clustering (students within schools)
fit3 <- nl_fit(
  data    = mydata,
  y       = "score",
  x       = "age",
  cluster = c("student_id", "school_id"),
  nested  = TRUE
)

# Random spline slopes
fit4 <- nl_fit(
  data         = mydata,
  y            = "score",
  x            = "age",
  cluster      = "id",
  random_slope = TRUE
)

## End(Not run)


MultiSpline documentation built on April 16, 2026, 9:06 a.m.