bigtps: Fits Cubic Thin-Plate Splines
In taylerablake/thin-plate-splines: Smoothing Splines for Large Samples

Description Usage Arguments Details Value Warnings Computational Details Note Author(s) References Examples

Given a real-valued response vector \mathbf{y}=\{y_{i}\}_{n\times1}, a thin-plate spline model has the form

y_{i}=η(\mathbf{x}_{i})+e_{i}

where y_{i} is the i-th observation's respone, \mathbf{x}_{i}=(x_{i1},…,x_{id}) is the i-th observation's nonparametric predictor vector, η is an unknown smooth function relating the response and predictor, and e_{i}\sim\mathrm{N}(0,σ^{2}) is iid Gaussian error. Function only fits interaction models.

1
2
3

bigtps(x,y,nknots=NULL,nvec=NULL,rparm=NA,
       alpha=1,lambdas=NULL,se.fit=FALSE,
       rseed=1234,knotcheck=TRUE)

`x`	Predictor vector or matrix with three or less columns.
`y`	Response vector. Must be same length as `x` has rows.
`nknots`	Two possible options: (a) scalar giving total number of random knots to sample, or (b) vector indexing which rows of `x` to use as knots.
`nvec`	Number of eigenvectors (and eigenvalues) to use in approximation. Must be less than or equal to the number of knots and greater than or equal to `ncol(x)+2`. Default sets `nvec<-nknots`. Can also input `0<nvec<1` to retain `nvec` percentage of eigenbasis variation.
`rparm`	Rounding parameter(s) for `x`. Use `rparm=NA` to fit unrounded solution. Can provide one (positive) rounding parameter for each column of `x`.
`alpha`	Manual tuning parameter for GCV score. Using `alpha=1` gives unbaised esitmate. Using a larger alpha enforces a smoother estimate.
`lambdas`	Vector of global smoothing parameters to try. Default estimates smoothing parameter that minimizes GCV score.
`se.fit`	Logical indicating if the standard errors of fitted values should be estimated.
`rseed`	Random seed for knot sampling. Input is ignored if `nknots` is an input vector of knot indices. Set `rseed=NULL` to obtain a different knot sample each time, or set `rseed` to any positive integer to use a different seed than the default.
`knotcheck`	If `TRUE`, only unique knots are used (for stability).

To estimate η I minimize the penalized least-squares functional

\frac{1}{n}∑_{i=1}^{n}(y_{i}-η(\mathbf{x}_{i}))^{2}+λ J(η)

where J(η) is the thin-plate penalty (see Helwig and Ma) and λ≥q0 is a smoothing parameter that controls the trade-off between fitting and smoothing the data. Default use of the function estimates λ by minimizing the GCV score (see bigspline).

Using the rounding parameter input rparm can greatly speed-up and stabilize the fitting for large samples. When rparm is used, the spline is fit to a set of unique data points after rounding; the unique points are determined using the efficient algorithm described in Helwig (2013). Rounding parameter should be on the raw data scale.

`fitted.values`	Vector of fitted values corresponding to the original data points in `x` (if `rparm=NA`) or the rounded data points in `xunique` (if `rparm` is used).
`se.fit`	Vector of standard errors of `fitted.values` (if input `se.fit=TRUE)`.
`x`	Predictor vector (same as input).
`y`	Response vector (same as input).
`xunique`	Unique elements of `x` after rounding (if `rparm` is used).
`yunique`	Mean of `y` for unique elements of `x` after rounding (if `rparm` is used).
`funique`	Vector giving frequency of each element of `xunique` (if `rparm` is used).
`sigma`	Estimated error standard deviation, i.e., \hat{σ}.
`ndf`	Data frame with two elements: `n` is total sample size, and `df` is effective degrees of freedom of fit model (trace of smoothing matrix).
`info`	Model fit information: vector containing the GCV, multiple R-squared, AIC, and BIC of fit model (assuming Gaussian error).
`myknots`	Spline knots used for fit.
`nvec`	Number of eigenvectors used for solution.
`rparm`	Rounding parameter for `x` (same as input).
`lambda`	Optimal smoothing parameter.
`coef`	Spline basis function coefficients.
`coef.csqrt`	Matrix square-root of covariace matrix of `coef`. Use `tcrossprod(coef.csqrt)` to get covariance matrix of `coef`.

Input nvec must be greater than ncol(x)+1.

When using rounding parameters, output fitted.values corresponds to unique rounded predictor scores in output xunique. Use predict.bigtps function to get fitted values for full y vector.

According to thin-plate spline theory, the function η can be approximated as

η(x) = ∑_{k=1}^{M}d_{k}φ_{k}(\mathbf{x}) + ∑_{h=1}^{q}c_{h}ξ(\mathbf{x},\mathbf{x}_{h}^{*})

where the \{φ_{k}\}_{k=1}^{M} are linear functions, ξ is the thin-plate spline semi-kernel, \{\mathbf{x}_{h}^{*}\}_{h=1}^{q} are the knots, and the c_{h} coefficients are constrained to be orthongonal to the \{φ_{k}\}_{k=1}^{M} functions.

This implies that the penalized least-squares functional can be rewritten as

\|\mathbf{y} - \mathbf{K}\mathbf{d} - \mathbf{J}\mathbf{c}\|^{2} + nλ\mathbf{c}'\mathbf{Q}\mathbf{c}

where \mathbf{K}=\{φ(\mathbf{x}_{i})\}_{n \times M} is the null space basis function matrix, \mathbf{J}=\{ξ(\mathbf{x}_{i},\mathbf{x}_{h}^{*})\}_{n \times q} is the contrast space basis funciton matrix, \mathbf{Q}=\{ξ(\mathbf{x}_{g}^{*},\mathbf{x}_{h}^{*})\}_{q \times q} is the penalty matrix, and \mathbf{d}=(d_{0},…,d_{M})' and \mathbf{c}=(c_{1},…,c_{q})' are the unknown basis function coefficients, where \mathbf{c} are constrained to be orthongonal to the \{φ_{k}\}_{k=1}^{M} functions.

See Helwig and Ma for specifics about how the constrained estimation is handled.

The spline is estimated using penalized least-squares, which does not require the Gaussian error assumption. However, the spline inference information (e.g., standard errors and fit information) requires the Gaussian error assumption.

Nathaniel E. Helwig <helwig@umn.edu>

Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer.

Helwig, N. E. and Ma, P. (2015). Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. Journal of Computational and Graphical Statistics, 24, 715-732.

Helwig, N. E. and Ma, P. (2016). Smoothing spline ANOVA for super-large samples: Scalable computation via rounding parameters. Statistics and Its Interface, 9, 433-444.

##########   EXAMPLE 1   ##########

# define relatively smooth function
set.seed(773)
myfun <- function(x){ sin(2*pi*x) }
x <- runif(500)
y <- myfun(x) + rnorm(500)

# fit thin-plate spline (default 1 dim: 30 knots)
tpsmod <- bigtps(x,y)
tpsmod


##########   EXAMPLE 2   ##########

# define more jagged function
set.seed(773)
myfun <- function(x){ 2*x+cos(2*pi*x) }
x <- runif(500)*4
y <- myfun(x) + rnorm(500)

# try different numbers of knots
r1mod <- bigtps(x,y,nknots=20,rparm=0.01)
crossprod( myfun(r1mod$xunique) - r1mod$fitted )/length(r1mod$fitted)
r2mod <- bigtps(x,y,nknots=35,rparm=0.01)
crossprod( myfun(r2mod$xunique) - r2mod$fitted )/length(r2mod$fitted)
r3mod <- bigtps(x,y,nknots=50,rparm=0.01)
crossprod( myfun(r3mod$xunique) - r3mod$fitted )/length(r3mod$fitted)


##########   EXAMPLE 3   ##########

# function with two continuous predictors
set.seed(773)
myfun <- function(x1v,x2v){
  sin(2*pi*x1v) + log(x2v+.1) + cos(pi*(x1v-x2v))
}
x <- cbind(runif(500),runif(500))
y <- myfun(x[,1],x[,2]) + rnorm(500)

# fit thin-plate spline with 50 knots (default 2 dim: 100 knots)
tpsmod <- bigtps(x,y,nknots=50)
tpsmod
crossprod( myfun(x[,1],x[,2]) - tpsmod$fitted.values )/500


##########   EXAMPLE 4   ##########

# function with three continuous predictors
set.seed(773)
myfun <- function(x1v,x2v,x3v){
  sin(2*pi*x1v) + log(x2v+.1) + cos(pi*x3v)
  }
x <- cbind(runif(500),runif(500),runif(500))
y <- myfun(x[,1],x[,2],x[,3]) + rnorm(500)

# fit thin-plate spline with 50 knots (default 3 dim: 200 knots)
tpsmod <- bigtps(x,y,nknots=50)
tpsmod
crossprod( myfun(x[,1],x[,2],x[,3]) - tpsmod$fitted.values )/500

taylerablake/thin-plate-splines documentation built on May 8, 2019, 11:16 p.m.

taylerablake/thin-plate-splines index

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

taylerablake/thin-plate-splines
Smoothing Splines for Large Samples

bigtps: Fits Cubic Thin-Plate Splines
In taylerablake/thin-plate-splines: Smoothing Splines for Large Samples

Description

Usage

Arguments

Details

Value

Warnings

Computational Details

Note

Author(s)

References

Examples

Related to bigtps in taylerablake/thin-plate-splines...

R Package Documentation

Browse R Packages

We want your feedback!

taylerablake/thin-plate-splines Smoothing Splines for Large Samples

bigtps: Fits Cubic Thin-Plate Splines In taylerablake/thin-plate-splines: Smoothing Splines for Large Samples

Description

Usage

Arguments

Details

Value

Warnings

Computational Details

Note

Author(s)

References

Examples

Related to bigtps in taylerablake/thin-plate-splines...

R Package Documentation

Browse R Packages

We want your feedback!

taylerablake/thin-plate-splines
Smoothing Splines for Large Samples

bigtps: Fits Cubic Thin-Plate Splines
In taylerablake/thin-plate-splines: Smoothing Splines for Large Samples