nominal: Nominal Smoothing Spline Basis and Penalty

nominalR Documentation

Nominal Smoothing Spline Basis and Penalty

Description

Generate the smoothing spline basis and penalty matrix for a nominal spline. This basis and penalty are for an unordered factor.

Usage

basis.nom(x, knots, K = NULL, intercept = FALSE, ridge = FALSE)

penalty.nom(x, K = NULL)

Arguments

x

Predictor variable (basis) or spline knots (penalty). Factor or integer vector of length n.

knots

Spline knots. Factor or integer vector of length r.

K

Number of levels of x. If NULL, this argument is defined as K = length(unique(x)).

intercept

If TRUE, the first column of the basis will be a column of ones.

ridge

If TRUE, the basis matrix is post-multiplied by the inverse square root of the penalty matrix. See Note and Examples.

Details

Generates a basis function or penalty matrix used to fit nominal smoothing splines.

With an intercept included, the basis function matrix has the form

X = [X_0, X_1]

where matrix X_0 is an n by 1 matrix of ones, and X_1 is a matrix of dimension n by r.

The X_0 matrix contains the "parametric part" of the basis (i.e., the intercept). The matrix X_1 contains the "nonparametric part" of the basis, which consists of the reproducing kernel function

\rho(x, y) = \delta_{x y} - 1/K

evaluated at all combinations of x and knots. The notation \delta_{x y} denotes Kronecker's delta function.

The penalty matrix consists of the reproducing kernel function

\rho(x, y) = \delta_{x y} - 1/K

evaluated at all combinations of x.

Value

Basis: Matrix of dimension c(length(x), df) where df = length(knots) + intercept.

Penalty: Matrix of dimension c(r, r) where r = length(x) is the number of knots.

Note

If the inputs x and knots are factors, they should have the same levels.

If the inputs x and knots are integers, the knots should be a subset of the input x.

If ridge = TRUE, the penalty matrix is the identity matrix.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Gu, C. (2013). Smoothing Spline ANOVA Models. 2nd Ed. New York, NY: Springer-Verlag. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/978-1-4614-5369-7")}

Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.3389/fams.2017.00015")}

Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.4135/9781526421036885885")}

Helwig, N. E., & Ma, P. (2015). Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. Journal of Computational and Graphical Statistics, 24(3), 715-732. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/10618600.2014.926819")}

See Also

See ordinal for a basis and penalty for ordered factors.

Examples

######***######   standard parameterization   ######***######

# generate data
set.seed(0)
n <- 101
x <- factor(sort(rep(LETTERS[1:4], length.out = n)))
knots <- LETTERS[1:3]
eta <- 1:4
y <- eta[x] + rnorm(n, sd = 0.5)

# nominal smoothing spline basis
X <- basis.nom(x, knots, intercept = TRUE)

# nominal smoothing spline penalty
Q <- penalty.nom(knots, K = 4)

# pad Q with zeros (for intercept)
Q <- rbind(0, cbind(0, Q))

# define smoothing parameter
lambda <- 1e-5

# estimate coefficients
coefs <- solve(crossprod(X) + n * lambda * Q) %*% crossprod(X, y)

# estimate eta
yhat <- X %*% coefs

# check rmse
sqrt(mean((eta[x] - yhat)^2))



######***######   ridge parameterization   ######***######

# generate data
set.seed(0)
n <- 101
x <- factor(sort(rep(LETTERS[1:4], length.out = n)))
knots <- LETTERS[1:3]
eta <- 1:4
y <- eta[x] + rnorm(n, sd = 0.5)

# nominal smoothing spline basis
X <- basis.nom(x, knots, intercept = TRUE, ridge = TRUE)

# nominal smoothing spline penalty (ridge)
Q <- diag(rep(c(0, 1), times = c(1, ncol(X) - 1)))

# define smoothing parameter
lambda <- 1e-5

# estimate coefficients
coefs <- solve(crossprod(X) + n * lambda * Q) %*% crossprod(X, y)

# estimate eta
yhat <- X %*% coefs

# check rmse
sqrt(mean((eta[x] - yhat)^2))


npreg documentation built on May 29, 2024, 4:17 a.m.

Related to nominal in npreg...