Generate a Basis Matrix for Discrete Prolate Spheroidal (Slepian) Sequences

Share:

Description

Generate the basis matrix for a particular N, W Slepian sequence family member, with the additional property that the smoother passes constants without distortion. Can be quite slow execution due to the latter property.

Based on ns for implementation with gam.

Parallel implementation for mgcv included in package as slp.mgcv.

Usage

1
2
slp(x, W = NA, K = NA, deltat = 1, naive = FALSE, intercept = FALSE, 
    customSVD = TRUE, forceC = FALSE, returnS = FALSE)

Arguments

x

the predictor variable. Missing values are allowed. Assumed to be contiguous; if not, then converted to a contiguous series to determine appropriate N, K and W, then the basis vectors are back-converted at the termination of the routine. Should be in units of deltat.

W

the time bandwidth. Computed as the frequency domain analogue of the maximum period of interest for a time series-regression problem using “smooth functions of time”. For example, a period choice of 2 months converts to 60 days and W = 1/60 cycles per day. Alternatively, if the interest is in a period of 7 cycles per year, then W = 7 / 365.2425 = 0.0192 cycles per day.

K

the number of basis vectors requested. If not provided, then W must be, and K is set to approximately floor(2 * N * W - 1). This parameter is approximately equivalent to df for ns with fixed dimension. Note: if you specify K higher than 2 * N * W + 1 performance will suffer significantly. The actual number of basis vectors returned is K-1 for the case of intercept = FALSE, and K for intercept = TRUE.

deltat

the time step for the input x. Restricted to 1 and 6 days for ease of logic checking, as these are the most traditional choices. Assumes that W is in the same units, and has no real impact beyond this, so it is trivial to make deltat symbolically equal an arbitrary choice and convert W to match.

naive

a flag for returning the naive (default) Slepian basis vectors v (TRUE) rather than the mean-passing SLP2 or SLP3 variants (FALSE).

intercept

a flag for choosing between a SLP2 or SLP3 basis. Type-2 bases capture (absorb) means of target series, while Type-3 bases ignore (pass) means.

customSVD

a flag for using the built-in svd (case FALSE) or a modified version of DGESDD LAPACK 3.5.0. The modified version provides significant speed improvements as it skips a number of unnecessary steps for the particular edge case needed by slp.

forceC

a flag for forced computation of the basis vectors. Several combinations of commonly used N, W, K parameters have been pre-computed and included with the package. If this parameter is set to TRUE, the routine will compute the basis vectors regardless of whether they are available in pre-computed form. See checkSaved{checkSaved} for further details.

returnS

a flag for returning the projection matrix S rather than the basis vectors that form the same. Intended to be used outside of model-fitting environments.

Details

slp is based around the routine .dpss, which generates a family of Discrete Prolate Spheroidal (Slepian) Sequences. These vectors are orthonormal, have alternating even/odd parity, and form the optimally concentrated basis set for the subspace of R^N corresponding to the bandwidth W. Full details are given in Slepian (1978). These basis functions have natural boundary conditions, and lack any form of knot structure. This version is returned for naive = TRUE.

The dpss basis vectors can be adapted to provide the additional useful property of capturing or passing constants perfectly. That is, the smoother matrix S formed from the returned rectangular matrix will either reproduce constants at near round-off precision, i.e., S %*% rep(1, N) = rep(1, N), for naive = FALSE with intercept = TRUE, or will pass constants, i.e., S %*% rep(1, N) = rep(0, N), for naive = FALSE with intercept = FALSE.

The primary use is in modeling formula to directly specify a Slepian time-based smoothing term in a model: see the examples.

For large N this routine can be very slow. If you are computing models with large N, we highly recommend pre-computing the basis object, then using it in your models without recomputation. The third example below demonstrates this approach.

Value

A matrix of dimension length(x) * K or length(x) * (K-1) where either K was supplied, or W was supplied and K converted. Note that the basis vectors are computed on a contiguous grid based on x, and then back-converted to the time structure of x.

Attributes are returned that correspond to the arguments to ns, and explicitly give K, W, etc.

References

Thomson, D.J (1982) Spectrum estimation and harmonic analysis. Proceedings of the IEEE. Volume 70, number 9, pp. 1055-1096.

Slepian, David (1978) Prolate Spheroidal Wave Functions, Fourier Analysis, and Uncertainty V: the Discrete Case. Bell System Technical Journal. Volume 57, pp. 1371-1429.

See Also

ns

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
    # Examples using pkg:gam
    library("gam")
    library("slp")
    N <- 730
    W <- 14 / N
    K <- 28         # will actually use 27 df when intercept = FALSE
    x <- rnorm(n = N, sd = 1)
    y <- x + rnorm(n = N, sd = 2) + 5.0
    t <- seq(1, N)    

    # note: all three examples share identical results

    # example with in-call computation, using K (df)
    fit1 <- gam(y ~ x + slp(t, K = K, forceC = TRUE), family = gaussian) 

    # example with in-call computation, using W
    fit2 <- gam(y ~ x + slp(t, W = W, forceC = TRUE), family = gaussian)

    # example with out-of-call computation, using K
    timeBasis <- slp(t, K = K, forceC = TRUE)
    fit3 <- gam(y ~ x + timeBasis, family = gaussian)

    # the same computations can be done using pre-computed basis vectors
    # for significant speed-ups, especially for large N - see `checkSaved'
    # for more details
    fit4 <- gam(y ~ x + slp(t, W = W, forceC = FALSE))