well_knotted_spline: Natural cubic spline with good choice of knots

View source: R/spline.R

well_knotted_spline    R Documentation

Description

For use in model formulas. Produces a natural cubic spline basis as in splines::ns, but with knot positions chosen using k-means rather than quantiles. Automatically uses fewer knots if x has too few distinct values.

Usage

well_knotted_spline(x, n_knots, verbose = TRUE)

Arguments

x

The predictor variable. A numeric vector.

n_knots

Number of knots to use. Fewer are used if x has too few distinct values.

verbose

If TRUE, produce a message about the knots chosen.

Details

Wong (1982, 1984) showed that, in one dimension, the asymptotic density of k-means cluster centers is proportional to the cube root of the density of x. Compared to using quantiles (the default for ns), choosing knots using k-means therefore spreads knot locations more evenly when the distribution of values is very uneven.
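The cube-root relationship can be written out as follows. The symbols f (density of x) and g (limiting density of knot locations) are notation introduced here for illustration and do not come from the package.

```latex
g_{\text{k-means}}(x) \;=\; \frac{f(x)^{1/3}}{\int f(t)^{1/3}\,dt},
\qquad
g_{\text{quantile}}(x) \;=\; f(x)
```

For example, if f is 1000 times higher in one region than in another, quantile placement puts about 1000 times as many knots per unit length there, whereas k-means placement puts only about 1000^(1/3) = 10 times as many.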

The k-means clustering is computed in an optimal, deterministic way using Ckmeans.1d.dp.
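The difference between the two placement rules can be sketched as below. This is an illustration only, not the package's implementation: it assumes the k-means cluster centers are used directly as interior knots for ns, which this page does not confirm, and the simulated data and variable names are made up for the example.

```r
library(splines)
library(Ckmeans.1d.dp)

set.seed(1)
x <- rexp(1000)   # strongly right-skewed predictor (simulated)
n_knots <- 3

# Quantile placement: interior knots at evenly spaced quantiles,
# which is what ns does by default when given df
quantile_knots <- unname(quantile(x, seq_len(n_knots) / (n_knots + 1)))

# k-means placement: optimal one-dimensional cluster centers
# (assumed here to serve directly as interior knots)
kmeans_knots <- Ckmeans.1d.dp(x, k = n_knots)$centers

# Natural cubic spline bases built from each set of knots
basis_q <- ns(x, knots = quantile_knots)
basis_k <- ns(x, knots = kmeans_knots)

print(quantile_knots)
print(kmeans_knots)
```

On skewed data like this, the quantile knots crowd into the dense region near zero, while the k-means knots extend further into the sparse tail.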

Value

A matrix of predictors (spline basis columns), similar to that returned by ns.

This function supports "safe prediction" (see makepredictcall). The knot locations chosen when the model was fitted are reused when predicting with predict.
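A minimal sketch of what safe prediction means in practice, assuming the weitrix package is installed. The new wt values are arbitrary and chosen only for illustration.

```r
library(weitrix)

# Fit on the full data; knots are chosen from mtcars$wt at this point
fit <- lm(mpg ~ well_knotted_spline(wt, 3, verbose = FALSE), data = mtcars)

# Predict for new weights; the knots from the original fit are reused
# rather than being re-chosen from the two new values
new_cars <- data.frame(wt = c(2.5, 3.5))
pred <- predict(fit, newdata = new_cars)
print(pred)
```

Without safe prediction, the spline basis would be rebuilt from the values in newdata, and the fitted coefficients would no longer match the basis columns.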

References

Wong, M. (1982). Asymptotic properties of univariate sample k-means clusters. Working paper #1341-82, Sloan School of Management, MIT. https://dspace.mit.edu/handle/1721.1/46876

Wong, M. (1984). Asymptotic properties of univariate sample k-means clusters. Journal of Classification, 1(1), 255–270. https://doi.org/10.1007/BF01890126

See Also

ns, makepredictcall

Examples

lm(mpg ~ well_knotted_spline(wt, 3), data = mtcars)

# When there are too few distinct values, fewer knots are used
lm(mpg ~ well_knotted_spline(gear, 3), data = mtcars)

library(ggplot2)
ggplot(diamonds, aes(carat, price)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ well_knotted_spline(x, 10))


pfh/weitrix documentation built on Oct. 13, 2023, 1:01 p.m.