# ordinal: Ordinal Smoothing Spline Basis and Penalty In npreg: Nonparametric Regression via Smoothing Splines

## Description

Generate the smoothing spline basis and penalty matrix for an ordinal spline. This basis and penalty are for an ordered factor.

## Usage

 ```1 2 3``` ```basis.ord(x, knots, K = NULL, intercept = FALSE, ridge = FALSE) penalty.ord(x, K = NULL, xlev = NULL) ```

## Arguments

 `x` Predictor variable (basis) or spline knots (penalty). Ordered factor or integer vector of length n. `knots` Spline knots. Ordered factor or integer vector of length r. `K` Number of levels of `x`. If `NULL`, this argument is defined as `K = length(unique(x))`. `xlev` Factor levels of `x` (for penalty). If `NULL`, the levels are defined as `levels(as.ordered(x))`. `intercept` If `TRUE`, the first column of the basis will be a column of ones. `ridge` If `TRUE`, the basis matrix is post-multiplied by the inverse square root of the penalty matrix. See Note and Examples.

## Details

Generates a basis function or penalty matrix used to fit ordinal smoothing splines.

With an intercept included, the basis function matrix has the form

X = [X_0, X_1]

where matrix `X_0` is an n by 1 matrix of ones, and `X_1` is a matrix of dimension n by r. The `X_0` matrix contains the "parametric part" of the basis (i.e., the intercept). The matrix `X_1` contains the "nonparametric part" of the basis, which consists of the reproducing kernel function

ρ(x, y) = 1 - (x \vee y) + (1 / 2K) * ( x(x-1) + y(y-1) ) + c

evaluated at all combinations of `x` and `knots`. The notation (x \vee y) denotes the maximum of x and y, and the constant is c = (K-1)(2K-1) / (6K).

The penalty matrix consists of the reproducing kernel function

ρ(x, y) = 1 - (x \vee y) + (1 / 2K) * ( x(x-1) + y(y-1) ) + c

evaluated at all combinations of `x`.

## Value

Basis: Matrix of dimension `c(length(x), df)` where `df = length(knots) + intercept`.

Penalty: Matrix of dimension `c(r, r)` where `r = length(x)` is the number of knots.

## Note

If the inputs `x` and `knots` are factors, they should have the same levels.

If the inputs `x` and `knots` are integers, the `knots` should be a subset of the input `x`.

If `ridge = TRUE`, the penalty matrix is the identity matrix.

## Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

## References

Gu, C. (2013). Smoothing Spline ANOVA Models. 2nd Ed. New York, NY: Springer-Verlag. doi: 10.1007/978-1-4614-5369-7

Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13. doi: 10.3389/fams.2017.00015

Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. doi: 10.4135/9781526421036885885

See `nominal` for a basis and penalty for unordered factors.
See `polynomial` for a basis and penalty for numeric variables.
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60``` ```######***###### standard parameterization ######***###### # generate data set.seed(0) n <- 101 x <- factor(sort(rep(LETTERS[1:4], length.out = n))) knots <- LETTERS[1:3] eta <- 1:4 y <- eta[x] + rnorm(n, sd = 0.5) # ordinal smoothing spline basis X <- basis.ord(x, knots, intercept = TRUE) # ordinal smoothing spline penalty Q <- penalty.ord(knots, K = 4) # pad Q with zeros (for intercept) Q <- rbind(0, cbind(0, Q)) # define smoothing parameter lambda <- 1e-5 # estimate coefficients coefs <- solve(crossprod(X) + n * lambda * Q) %*% crossprod(X, y) # estimate eta yhat <- X %*% coefs # check rmse sqrt(mean((eta[x] - yhat)^2)) ######***###### ridge parameterization ######***###### # generate data set.seed(0) n <- 101 x <- factor(sort(rep(LETTERS[1:4], length.out = n))) knots <- LETTERS[1:3] eta <- 1:4 y <- eta[x] + rnorm(n, sd = 0.5) # ordinal smoothing spline basis X <- basis.ord(x, knots, intercept = TRUE, ridge = TRUE) # ordinal smoothing spline penalty (ridge) Q <- diag(rep(c(0, 1), times = c(1, ncol(X) - 1))) # define smoothing parameter lambda <- 1e-5 # estimate coefficients coefs <- solve(crossprod(X) + n * lambda * Q) %*% crossprod(X, y) # estimate eta yhat <- X %*% coefs # check rmse sqrt(mean((eta[x] - yhat)^2)) ```