# slp-gam: Generate a Basis Matrix for Discrete Prolate Spheroidal... In slp: Discrete Prolate Spheroidal (Slepian) Sequence Regression Smoothers

## Description

Generate the basis matrix for a particular `N, W` Slepian sequence family member, with the additional property that the smoother passes constants without distortion. Can be quite slow execution due to the latter property.

Based on `ns` for implementation with `gam`.

Parallel implementation for `mgcv` included in package as `slp.mgcv`.

## Usage

 ```1 2``` ```slp(x, W = NA, K = NA, deltat = 1, naive = FALSE, intercept = FALSE, customSVD = TRUE, forceC = FALSE, returnS = FALSE) ```

## Arguments

 `x` the predictor variable. Missing values are allowed. Assumed to be contiguous; if not, then converted to a contiguous series to determine appropriate `N`, `K` and `W`, then the basis vectors are back-converted at the termination of the routine. Should be in units of `deltat`. `W` the time bandwidth. Computed as the frequency domain analogue of the maximum period of interest for a time series-regression problem using “smooth functions of time”. For example, a period choice of 2 months converts to 60 days and `W = 1/60` cycles per day. Alternatively, if the interest is in a period of `7` cycles per year, then `W = 7 / 365.2425 = 0.0192` cycles per day. `K` the number of basis vectors requested. If not provided, then `W` must be, and `K` is set to approximately `floor(2 * N * W - 1)`. This parameter is approximately equivalent to `df` for `ns` with fixed dimension. Note: if you specify `K` higher than `2 * N * W + 1` performance will suffer significantly. The actual number of basis vectors returned is `K-1` for the case of `intercept = FALSE`, and `K` for `intercept = TRUE`. `deltat` the time step for the input `x`. Restricted to `1` and `6` days for ease of logic checking, as these are the most traditional choices. Assumes that `W` is in the same units, and has no real impact beyond this, so it is trivial to make `deltat` symbolically equal an arbitrary choice and convert `W` to match. `naive` a flag for returning the naive (default) Slepian basis vectors `v` (`TRUE`) rather than the mean-passing SLP2 or SLP3 variants (`FALSE`). `intercept` a flag for choosing between a SLP2 or SLP3 basis. Type-2 bases capture (absorb) means of target series, while Type-3 bases ignore (pass) means. `customSVD` a flag for using the built-in `svd` (case `FALSE`) or a modified version of `DGESDD` LAPACK 3.5.0. The modified version provides significant speed improvements as it skips a number of unnecessary steps for the particular edge case needed by `slp`. `forceC` a flag for forced computation of the basis vectors. Several combinations of commonly used `N, W, K` parameters have been pre-computed and included with the package. If this parameter is set to `TRUE`, the routine will compute the basis vectors regardless of whether they are available in pre-computed form. See `checkSaved{checkSaved}` for further details. `returnS` a flag for returning the projection matrix `S` rather than the basis vectors that form the same. Intended to be used outside of model-fitting environments.

## Details

`slp` is based around the routine `.dpss`, which generates a family of Discrete Prolate Spheroidal (Slepian) Sequences. These vectors are orthonormal, have alternating even/odd parity, and form the optimally concentrated basis set for the subspace of `R^N` corresponding to the bandwidth `W`. Full details are given in Slepian (1978). These basis functions have natural boundary conditions, and lack any form of knot structure. This version is returned for `naive = TRUE`.

The `dpss` basis vectors can be adapted to provide the additional useful property of capturing or passing constants perfectly. That is, the smoother matrix `S` formed from the returned rectangular matrix will either reproduce constants at near round-off precision, i.e., `S %*% rep(1, N) = rep(1, N)`, for `naive = FALSE` with `intercept = TRUE`, or will pass constants, i.e., `S %*% rep(1, N) = rep(0, N)`, for `naive = FALSE` with `intercept = FALSE`.

The primary use is in modeling formula to directly specify a Slepian time-based smoothing term in a model: see the examples.

For large `N` this routine can be very slow. If you are computing models with large `N`, we highly recommend pre-computing the basis object, then using it in your models without recomputation. The third example below demonstrates this approach.

## Value

A matrix of dimension `length(x) * K` or `length(x) * (K-1)` where either `K` was supplied, or `W` was supplied and `K` converted. Note that the basis vectors are computed on a contiguous grid based on `x`, and then back-converted to the time structure of `x`.

Attributes are returned that correspond to the arguments to `ns`, and explicitly give `K`, `W`, etc.

## References

Thomson, D.J (1982) Spectrum estimation and harmonic analysis. Proceedings of the IEEE. Volume 70, number 9, pp. 1055-1096.

Slepian, David (1978) Prolate Spheroidal Wave Functions, Fourier Analysis, and Uncertainty V: the Discrete Case. Bell System Technical Journal. Volume 57, pp. 1371-1429.

`ns`
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26``` ``` # Examples using pkg:gam library("gam") library("slp") N <- 730 W <- 14 / N K <- 28 # will actually use 27 df when intercept = FALSE x <- rnorm(n = N, sd = 1) y <- x + rnorm(n = N, sd = 2) + 5.0 t <- seq(1, N) # note: all three examples share identical results # example with in-call computation, using K (df) fit1 <- gam(y ~ x + slp(t, K = K, forceC = TRUE), family = gaussian) # example with in-call computation, using W fit2 <- gam(y ~ x + slp(t, W = W, forceC = TRUE), family = gaussian) # example with out-of-call computation, using K timeBasis <- slp(t, K = K, forceC = TRUE) fit3 <- gam(y ~ x + timeBasis, family = gaussian) # the same computations can be done using pre-computed basis vectors # for significant speed-ups, especially for large N - see `checkSaved' # for more details fit4 <- gam(y ~ x + slp(t, W = W, forceC = FALSE)) ```