well_knotted_spline: Natural cubic spline with good choice of knots



Description

For use in model formulas. Produces a natural cubic spline basis as in splines::ns, but with knot positions chosen using k-means rather than quantiles. Automatically uses fewer knots if there are insufficient distinct values.

Usage

well_knotted_spline(x, n_knots, verbose = TRUE)

Arguments

x

The predictor variable. A numeric vector.

n_knots

Number of knots to use.

verbose

If TRUE, produce a message about the knots chosen.

Details

Wong (1982, 1984) showed that the asymptotic density of k-means clusters in one dimension is proportional to the cube root of the density of x. Compared to using quantiles (the default for ns), choosing knots using k-means produces a better spread of knot locations if the distribution of values is very uneven.

k-means is computed in an optimal, deterministic way using Ckmeans.1d.dp.
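
To make the difference concrete, the following is a minimal sketch rather than the package's implementation. It compares quantile-based knots with k-means centers on a strongly skewed variable. Two assumptions are made purely for illustration: stats::kmeans (a randomized heuristic) stands in for Ckmeans.1d.dp so that the code runs with base R alone, and knots are taken to be the cluster centers, which this page does not confirm.

```r
# Sketch only: stats::kmeans stands in for Ckmeans.1d.dp, and placing knots
# at the cluster centers is an assumption, not documented behavior.
set.seed(1)
x <- rexp(2000, rate = 1)  # strongly right-skewed predictor

n_knots <- 5

# Quantile knots, as ns() chooses them by default:
# evenly spaced interior quantiles of x
probs <- seq(0, 1, length.out = n_knots + 2)[-c(1, n_knots + 2)]
quantile_knots <- unname(quantile(x, probs))

# One-dimensional k-means centers (heuristic, several random starts)
km <- kmeans(x, centers = n_knots, nstart = 20, iter.max = 100)
kmeans_knots <- sort(as.numeric(km$centers))

print(round(quantile_knots, 2))
print(round(kmeans_knots, 2))
```

On data like this, the quantile knots crowd into the dense region near zero, while the k-means centers reach further into the sparse right tail.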

Value

A matrix of predictors, similar to ns.

This function supports "safe prediction" (see makepredictcall). Original knot locations will be used for prediction with predict.
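
The safe-prediction mechanism can be illustrated with splines::ns, which supports it in the same way. This is a sketch using ns as a stand-in, not a demonstration of well_knotted_spline itself; the model and new data are chosen only for illustration.

```r
library(splines)

# Fit on the full data; ns() stands in for well_knotted_spline() here
fit <- lm(mpg ~ ns(wt, df = 3), data = mtcars)

# New data covering only part of the original range of wt
new_data <- data.frame(wt = c(2.0, 2.5, 3.0))

# predict() rebuilds the spline basis using the knots saved at fit time,
# rather than recomputing knots from new_data
pred <- predict(fit, newdata = new_data)
print(pred)

# The terms object stores a rewritten call that carries the original knots
saved_call <- attr(terms(fit), "predvars")
print(saved_call)
```

The printed call shows explicit knots and Boundary.knots arguments. These are filled in by makepredictcall when the model is fitted, which is what keeps predictions consistent with the original basis.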

References

Wong, M. (1982). Asymptotic properties of univariate sample k-means clusters. Working paper #1341-82, Sloan School of Management, MIT. https://dspace.mit.edu/handle/1721.1/46876

Wong, M. (1984). Asymptotic properties of univariate sample k-means clusters. Journal of Classification, 1(1), 255–270. https://doi.org/10.1007/BF01890126

See Also

ns, makepredictcall

Examples

lm(mpg ~ well_knotted_spline(wt,3), data=mtcars)

# When insufficient unique values exist, fewer knots are used
lm(mpg ~ well_knotted_spline(gear,3), data=mtcars)

library(ggplot2)
ggplot(diamonds, aes(carat, price)) + 
   geom_point() + 
   geom_smooth(method="lm", formula=y~well_knotted_spline(x,10))

weitrix documentation built on Nov. 8, 2020, 8:10 p.m.