ordspline: Fits Ordinal Smoothing Spline

Description

Given a real-valued response vector \mathbf{y}=\{y_{i}\}_{n\times1} and an ordinal predictor vector \mathbf{x}=\{x_{i}\}_{n\times 1} with x_{i} \in \{1,…,K\} \ \forall i, an ordinal smoothing spline model has the form

y_{i}=η(x_{i})+e_{i}

where y_{i} is the i-th observation's response, x_{i} is the i-th observation's predictor, η is an unknown function relating the response and predictor, and e_{i}\sim\mathrm{N}(0,σ^{2}) is iid Gaussian error.

Usage

ordspline(x, y, knots, weights, lambda, monotone=FALSE)

Arguments

x

Predictor vector.

y

Response vector. Must be same length as x.

knots

Either a scalar giving the number of equidistant knots to use, or a vector of values to use as the spline knots. If left blank, the number of knots is min(50, nu) where nu = length(unique(x)).

weights

Weights vector (for weighted penalized least squares). Must be same length as x and contain non-negative values.

lambda

Smoothing parameter. If left blank, lambda is tuned via Generalized Cross-Validation.

monotone

If TRUE, the relationship between x and y is constrained to be monotonically increasing.
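
The constraints listed above can be checked before calling ordspline. The following is only a sketch: check_ordspline_inputs is a hypothetical helper, not part of bigsplines, and it simply restates the documented length and non-negativity requirements along with the documented default number of knots.

```r
# Hypothetical helper (not part of bigsplines): checks the documented
# constraints on x, y, and weights, and returns the documented default
# number of knots, min(50, length(unique(x))).
check_ordspline_inputs <- function(x, y, weights = NULL) {
  stopifnot(length(y) == length(x))
  if (!is.null(weights)) {
    stopifnot(length(weights) == length(x), all(weights >= 0))
  }
  nu <- length(unique(x))
  min(50, nu)
}

x <- c(1, 1, 2, 3, 3, 3, 4)
y <- c(0.2, 0.1, 0.8, 1.1, 0.9, 1.3, 1.6)
default_nknots <- check_ordspline_inputs(x, y, weights = rep(1, 7))
```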

Details

To estimate η, I minimize the penalized least-squares functional

\frac{1}{n}∑_{i=1}^{n}(y_{i}-η(x_{i}))^{2}+λ ∑_{x=2}^{K} [η(x)-η(x-1)]^{2}

where λ ≥ 0 is a smoothing parameter that controls the trade-off between fitting and smoothing the data.
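
To make the criterion concrete, here is a minimal base-R sketch that minimizes it directly over the vector (η(1), …, η(K)). It assumes x holds integer codes in 1..K, and the helper name fit_ordinal_penalty is hypothetical. It illustrates the functional only and is not necessarily how ordspline computes its solution internally.

```r
# Hypothetical helper (not part of bigsplines): minimizes
#   (1/n) * sum((y - eta[x])^2) + lambda * sum(diff(eta)^2)
# over eta = (eta(1), ..., eta(K)), assuming x contains integers in 1..K.
fit_ordinal_penalty <- function(x, y, K, lambda) {
  n <- length(y)
  X <- diag(K)[x, , drop = FALSE]   # n x K indicator matrix: row i picks eta(x_i)
  D <- diff(diag(K))                # (K-1) x K first-difference matrix
  A <- crossprod(X) / n + lambda * crossprod(D)
  drop(solve(A, crossprod(X, y) / n))
}

set.seed(1)
x <- rep(1:5, each = 4)
y <- x + rnorm(20)
eta_rough  <- fit_ordinal_penalty(x, y, K = 5, lambda = 0)    # no smoothing
eta_smooth <- fit_ordinal_penalty(x, y, K = 5, lambda = 1e6)  # heavy smoothing
```

With lambda = 0 and every level observed, the estimate reduces to the per-level means of y. As lambda grows, the estimate shrinks toward a constant equal to mean(y).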

Default use of the function estimates λ by minimizing the GCV score:

\mbox{GCV}(λ) = \frac{n\|(\mathbf{I}_{n}-\mathbf{S}_{λ})\mathbf{y}\|^{2}}{[n-\mathrm{tr}(\mathbf{S}_{λ})]^2}

where \mathbf{I}_{n} is the identity matrix and \mathbf{S}_{λ} is the smoothing matrix.
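
The GCV formula can be evaluated over a grid of candidate λ values. The sketch below does this in base R for a first-difference penalty on integer-coded x. The helper gcv_score and the grid of candidates are illustrative assumptions, and the tuning routine in ordspline may differ.

```r
# Hypothetical helper (not part of bigsplines): GCV score
#   n * ||(I - S) y||^2 / (n - tr(S))^2
# for the first-difference penalty, assuming x contains integers in 1..K.
gcv_score <- function(x, y, K, lambda) {
  n <- length(y)
  X <- diag(K)[x, , drop = FALSE]   # n x K indicator matrix
  D <- diff(diag(K))                # (K-1) x K first-difference matrix
  A <- crossprod(X) / n + lambda * crossprod(D)
  S <- X %*% solve(A, t(X)) / n     # n x n smoothing matrix S_lambda
  res <- y - drop(S %*% y)
  n * sum(res^2) / (n - sum(diag(S)))^2
}

set.seed(1)
x <- sample(1:8, 200, replace = TRUE)
y <- sin(x / 2) + rnorm(200, sd = 0.3)
lambdas <- 10^seq(-6, 2, length.out = 50)   # illustrative grid
scores <- vapply(lambdas, function(l) gcv_score(x, y, 8, l), numeric(1))
lambda_hat <- lambdas[which.min(scores)]    # grid value with the smallest GCV
```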

Value

fitted.values

Vector of fitted values.

se.fit

Vector of standard errors of fitted.values.

sigma

Estimated error standard deviation, i.e., \hat{σ}.

lambda

Chosen smoothing parameter.

info

Model fit information: vector containing the GCV, R-squared, AIC, and BIC of the fitted model (assuming Gaussian error).

coef

Spline basis function coefficients.

coef.csqrt

Matrix square-root of the covariance matrix of coef. Use tcrossprod(coef.csqrt) to get the covariance matrix of coef.

n

Number of data points, i.e., length(x).

df

Effective degrees of freedom (trace of smoothing matrix).

xunique

Unique elements of x.

x

Predictor vector (same as input).

y

Response vector (same as input).

residuals

Residual vector, i.e., y - fitted.values.

knots

Spline knots used for fit.

monotone

Logical (same as input).
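
As a small illustration of the coef.csqrt entry, the sketch below applies tcrossprod to a synthetic stand-in matrix. The values are made up for the example and are not output from ordspline. With a real fit, the matrix would be taken from the coef.csqrt component of the returned object.

```r
# Synthetic stand-in for the coef.csqrt component (made-up values)
csqrt <- matrix(c(0.5, 0.1, 0.0,
                  0.0, 0.4, 0.2,
                  0.0, 0.0, 0.3), nrow = 3, byrow = TRUE)

coef_cov <- tcrossprod(csqrt)      # same as csqrt %*% t(csqrt)
coef_se  <- sqrt(diag(coef_cov))   # standard errors of the coefficients
```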

Warnings

When inputting user-specified knots, all values in knots must match a corresponding value in x.
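
A quick way to check this requirement before fitting is shown below. The data and knot vectors are made up for illustration.

```r
x <- c(1, 2, 2, 3, 5, 5, 6)
knots_ok  <- c(1, 3, 6)
knots_bad <- c(1, 4, 6)                  # 4 never occurs in x

all_match <- all(knots_ok %in% x)        # TRUE: every knot is a value of x
unmatched <- setdiff(knots_bad, x)       # knots with no matching value in x
```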

Note

The spline is estimated using penalized least-squares, which does not require the Gaussian error assumption. However, the spline inference information (e.g., standard errors and fit information) requires the Gaussian error assumption.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer.

Helwig, N. E. (2013). Fast and stable smoothing spline analysis of variance models for large samples with applications to electroencephalography data analysis. Unpublished doctoral dissertation. University of Illinois at Urbana-Champaign.

Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13.

Helwig, N. E. and Ma, P. (2015). Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. Journal of Computational and Graphical Statistics, 24, 715-732.

Examples

##########   EXAMPLE   ##########

# generate some data
n <- 100
nk <- 50
x <- seq(-3,3,length.out=n)
eta <- (sin(2*x/pi) + 0.25*x^3 + 0.05*x^5)/15
set.seed(1)
y <- eta + rnorm(n, sd=0.5)

# plot data and true eta
plot(x, y)
lines(x, eta, col="blue", lwd=2)

# fit ordinal smoothing spline
ossmod <- ordspline(x, y, knots=nk)
lines(ossmod$x, ossmod$fitted.values, col="red", lwd=2)

# fit monotonic smoothing spline
mssmod <- ordspline(x, y, knots=nk, monotone=TRUE)
lines(mssmod$x, mssmod$fitted.values, col="purple", lwd=2)

bigsplines documentation built on May 2, 2019, 9:27 a.m.