SVDsmooth: Smooth Basis Functions for Data Matrix with Missing Values


Description

Computes smooth basis functions for a data matrix with missing values, as described in Fuentes et al. (2006), or uses cross-validation to determine a suitable number of basis functions. SVDsmooth uses SVDmiss to complete the matrix and then computes smooth basis functions by applying smooth.spline to the SVD of the completed data matrix.
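
The two steps described above (complete the matrix, then smooth the SVD components) can be sketched in base R. This is an illustrative approximation under simple assumptions, not the code of SVDmiss or SVDsmooth: the mean-fill starting values, the stopping rule, and the names time.pts, X.fill and basis are choices made for this sketch, and the scaling step is omitted.

```r
## Illustrative base-R sketch only; NOT the SpatioTemporal implementation.
set.seed(1)
time.pts <- seq(0, 4*pi, length.out = 50)
X <- cbind(cos(time.pts), sin(2*time.pts)) %*% matrix(rnorm(10), 2, 5)
X <- X + 0.25 * rnorm(length(X))
X[c(3, 57, 120, 201)] <- NA   # a few missing cells; no row/column fully missing

n.basis <- 2
miss <- is.na(X)

## Step 1 (assumed scheme): start from column means, then repeatedly
## replace the missing cells by a rank-n.basis SVD approximation.
X.fill <- X
col.means <- colMeans(X, na.rm = TRUE)
X.fill[miss] <- col.means[col(X)[miss]]
for (iter in 1:100) {
  s <- svd(X.fill, nu = n.basis, nv = n.basis)
  X.hat <- s$u %*% diag(s$d[1:n.basis], n.basis) %*% t(s$v)
  rel.diff <- sum((X.hat[miss] - X.fill[miss])^2) / sum(X.fill[miss]^2)
  X.fill[miss] <- X.hat[miss]
  if (rel.diff < 1e-3) break   # assumed relative-change stopping rule
}

## Step 2: smooth each leading left singular vector over time.
s <- svd(X.fill, nu = n.basis, nv = n.basis)
basis <- sapply(1:n.basis, function(j) {
  fit <- smooth.spline(time.pts, s$u[, j])
  predict(fit, time.pts)$y
})
dim(basis)   # one row per time point, one column per basis function
```

Each column of basis is a smoothed version of one left singular vector of the completed matrix, ordered by decreasing singular value.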

Usage

SVDsmooth(X, n.basis = min(2, dim(X)[2]), date.ind = NULL, scale = TRUE,
  niter = 100, conv.reldiff = 0.001, df = NULL, spar = NULL,
  fnc = FALSE)

SVDsmoothCV(X, n.basis, ...)

Arguments

X

Data matrix, with missing values marked by NA (use createDataMatrix). Rows and/or columns that are completely missing will be dropped (with a message); for the dropped rows the smooths will be interpolated using predict.smooth.spline.

n.basis

Number of smooth basis functions to compute; passed as ncomp to SVDmiss. For SVDsmoothCV, a vector with the different numbers of basis functions to evaluate (0 is allowed).

date.ind

Vector giving the observation time of each row in X, used as x in smooth.spline when computing the smooth basis functions. If missing, convertCharToDate is used to coerce rownames(X).

scale

If TRUE, will use scale to scale X before calling SVDmiss.

niter, conv.reldiff

Controls convergence, passed to SVDmiss.

df, spar

The desired degrees of freedom/smoothing parameter for the spline; see smooth.spline.

fnc

If TRUE, return a function instead of the trend matrix; see Value below.

...

Additional parameters passed to SVDsmooth; i.e. date.ind, scale, niter, conv.reldiff, df, spar, and/or fnc.

Details

SVDsmoothCV uses leave-one-column-out cross-validation: it holds one column out from X, calls SVDsmooth on the remaining columns, and then regresses the held-out column on the resulting smooth functions. Cross-validation statistics computed for each of these regressions include MSE, R-squared, AIC and BIC. The weighted average (weighted by the number of observations in the column) is then reported as the CV statistics.
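
The leave-one-column-out procedure can be sketched in base R as follows. This is a rough illustration, not the package code: smooth_basis is a hypothetical stand-in for SVDsmooth (mean-fill, SVD, then smooth.spline), and the statistics are taken from the in-sample residuals of each regression, which may differ from the exact definitions used by SVDsmoothCV.

```r
## Illustrative base-R sketch only; NOT the SpatioTemporal implementation.
set.seed(2)
time.pts <- seq(0, 4*pi, length.out = 50)
X <- cbind(cos(time.pts), sin(2*time.pts)) %*% matrix(rnorm(10), 2, 5)
X <- X + 0.25 * rnorm(length(X))
X[c(3, 57, 120, 201)] <- NA

## Hypothetical stand-in for SVDsmooth: mean-fill, SVD, smooth.spline.
smooth_basis <- function(X, time.pts, n.basis) {
  mu <- colMeans(X, na.rm = TRUE)
  X[is.na(X)] <- mu[col(X)[is.na(X)]]
  u <- svd(X, nu = n.basis)$u
  apply(u, 2, function(y) predict(smooth.spline(time.pts, y), time.pts)$y)
}

n.basis <- 2
## One row of statistics per held-out column.
cv <- t(sapply(seq_len(ncol(X)), function(i) {
  B <- smooth_basis(X[, -i, drop = FALSE], time.pts, n.basis)
  y <- X[, i]
  fit <- lm(y ~ B)                 # rows with missing y are dropped by lm
  c(n = length(residuals(fit)),
    MSE = mean(residuals(fit)^2),
    R2 = summary(fit)$r.squared,
    AIC = AIC(fit), BIC = BIC(fit))
}))

## Average over columns, weighted by the number of observations used.
CV.stat <- colSums(cv[, -1] * cv[, "n"]) / sum(cv[, "n"])
CV.stat
```

Repeating the loop for several values of n.basis and comparing the resulting CV.stat vectors mirrors how the number of basis functions would be chosen.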

Value

Depends on the function:

SVDsmooth

If fnc==FALSE, a matrix where each column is a smooth basis function based on the SVD of the completed data matrix. The leftmost column contains the smooth of the most important SVD component. If fnc==TRUE, a function that returns this matrix of smooth basis functions when called as fnc(date.ind), fnc(1:dim(X)[1]), or fnc(convertCharToDate( rownames(X) )).

SVDsmoothCV

A list of class SVDcv with components:

CV.stat,CV.sd

data.frames with mean and standard deviation of the CV statistics for each of the number of basis functions evaluated.

MSE.all,R2.all,AIC.all,BIC.all

data.frames with the individual MSE, R2, AIC, and BIC values for each column in the data matrix and for each number of basis functions evaluated.

smoothSVD

A list with length(n.basis) components. If fnc==FALSE each component contains an array where smoothSVD[[j]][,,i] is the result of SVDsmooth applied to X[,-i] with n.basis[j] smooth functions; if fnc==TRUE each component contains a list of functions as smoothSVD[[j]][[i]].
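
The function-valued result described for fnc==TRUE can be pictured as a closure that keeps the fitted splines and evaluates them at any requested times. The sketch below uses only base R; basis_fnc, fits and the stand-in matrix u are hypothetical names for this illustration, and the actual function returned by SVDsmooth may be built differently.

```r
## Illustrative base-R sketch only; NOT the SpatioTemporal implementation.
set.seed(3)
time.pts <- seq(0, 4*pi, length.out = 50)
## Stand-in for two (noisy) SVD components observed at time.pts.
u <- cbind(cos(time.pts), sin(2*time.pts)) + 0.05 * rnorm(100)

## Fit one spline per component; the closure below keeps the fits.
fits <- lapply(seq_len(ncol(u)), function(j) smooth.spline(time.pts, u[, j]))
basis_fnc <- function(x = time.pts) {
  sapply(fits, function(f) predict(f, x)$y)
}

basis     <- basis_fnc()                  # matrix at the original times
basis.new <- basis_fnc(c(0.5, 1.5, 2.5))  # interpolated at new times
dim(basis.new)
```

Returning a function rather than a matrix makes it possible to evaluate the smooths at time points that were not rows of the original data matrix.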

Author(s)

Paul D. Sampson and Johan Lindstrom

References

M. Fuentes, P. Guttorp, and P. D. Sampson (2006). Using Transforms to Analyze Space-Time Processes. In: Statistical Methods for Spatio-Temporal Systems (B. Finkenstadt, L. Held, V. Isham, eds.), 77-150.

See Also

Other SVD for missing data: SVDmiss, calcSmoothTrends, plot.SVDcv, print.SVDcv, updateTrend.STdata

Other data matrix: SVDmiss, createDataMatrix, estimateBetaFields, mesa.data.raw

Other SVDcv methods: plot.SVDcv, print.SVDcv

Examples

##create a data matrix
t <- seq(0,4*pi,len=50)
X.org <- cbind(cos(t),sin(2*t)) %*% matrix(rnorm(10),2,5)

##add some normal errors
X <- X.org + .25*rnorm(length(X.org))
##and mark some data as missing
X[runif(length(X))<.25] <- NA

##Ensure that no row or column is completely missing
while( any(rowSums(is.na(X))==dim(X)[2]) || any(colSums(is.na(X))==dim(X)[1]) ){
  X <- X.org + .25*rnorm(length(X.org))
  X[runif(length(X))<.25] <- NA
}

##compute two smooth basis functions
res <- SVDsmooth(X, n.basis=2, niter=100)

##or compute the function that gives the basis functions
res.fnc <- SVDsmooth(X, n.basis=2, niter=100, fnc=TRUE)

##and they are equal
summary( res.fnc()-res )


##plot the two smooth basis functions
par(mfcol=c(3,2), mar=c(4,4,.5,.5))
plot(t, res[,1], ylim=range(res), type="l")
lines(t, res[,2], col=2)
##and some of the data fitted to the smooths
for(i in 1:5){
  plot(t, X[,i])
  lines(t, predict.lm(lm(X[,i]~res), data.frame(res)) )
  lines(t, X.org[,i], col=2)
}

##compute cross-validation for 0 to 4 basis functions
res.cv <- SVDsmoothCV(X, n.basis=0:4, niter=100)

##study cross-validation results
print(res.cv)
summary(res.cv)

##plot cross-validation statistics
plot(res.cv, sd=TRUE)
##boxplot of CV statistics for each column
boxplot(res.cv)
##plot the BIC for each column
plot(res.cv, "BIC", pairs=TRUE)

SpatioTemporal documentation built on May 2, 2019, 8:49 a.m.