WiSEBoot: Wild Scale-Enhanced (WiSE) Bootstrap for Model Selection

Description

Perform the WiSE bootstrap to estimate parameters within models of the form

Y = γ_0 1 + γ_1 t + Wγ + e

The threshold level for γ is selected automatically unless the user specifies one. The function also returns the WiSE bootstrap samples of the wavelet coefficients (for the selected threshold) and the bootstrap samples of the linear parameters (γ_0, γ_1).

Usage

WiSEBoot(X, R=100, XParam = NA, TauSq = "log", bootDistn = "normal", by.row = FALSE, 
         J0 = NA, wavFam = "DaubLeAsymm", wavFil = 8, wavBC = "periodic")

Arguments

X

a matrix or vector of equally-spaced data. All entries must be non-missing and numeric. If a vector is supplied, the length must be T=2^J where J is a positive integer. If a matrix is supplied, each data series must be of length T.
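The length requirement can be checked before calling the function. The following is a minimal sketch in base R; isDyadicLength is a hypothetical helper introduced here for illustration and is not part of the WiSEBoot package.

```r
# Hypothetical helper (not part of WiSEBoot): TRUE when n = 2^J
# for some positive integer J, FALSE otherwise.
isDyadicLength <- function(n) {
  n >= 2 && 2^round(log2(n)) == n
}

x <- rnorm(64)
isDyadicLength(length(x))   # TRUE: 64 = 2^6
isDyadicLength(100)         # FALSE: 100 is not a power of 2
```

A series that fails this check would need to be padded or truncated first, for example with padVector or padMatrix.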

R

number of bootstrap samples. Allowed value is a positive integer. Default is 100.

XParam

vector or matrix of linear trend parameters for the data series. If a vector is supplied in X, this should be a vector of length 2. If a matrix is supplied in X, this should be a matrix with 2 rows and an equal number of columns (dim(X)[2]=dim(XParam)[2]). If NA, WiSEBoot will automatically estimate the linear parameters via least squares. This quantity may be supplied by the linearParam return argument of padVector or padMatrix. See Details below.

TauSq

scale parameter for the bootstrap. Allowed values are "log", "log10", "sqrt", "1", or "2/5". The scale parameter is related to the length of the data series. For example, "log" implies a scale parameter τ = √(log(T)). The value "1" is equivalent to the wild bootstrap.

bootDistn

the distribution for the bootstrap. Allowed values are "normal", "uniform", "laplace", "lognormal", "gumbel", "exponential", "t5", "t8", and "t14". The wild bootstrap multipliers are iid draws from the specified distribution, with mean 0 and variance 1. For example, "t5" is Student's t-distribution with 5 degrees of freedom.
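To illustrate what "mean 0 and variance 1" means for a few of these distributions, here is a minimal sketch in base R. drawMultipliers is a hypothetical helper written for this illustration; how WiSEBoot standardizes its draws internally is not stated here, so the shifts and scalings below are assumptions based on the standard moments of each distribution.

```r
# Hypothetical helper (not WiSEBoot's internal code): draw n iid multipliers
# with mean 0 and variance 1 from a few of the named distributions.
drawMultipliers <- function(n, distn = c("normal", "uniform", "t5", "exponential")) {
  distn <- match.arg(distn)
  switch(distn,
    normal      = rnorm(n),
    uniform     = runif(n, min = -sqrt(3), max = sqrt(3)),  # variance (2*sqrt(3))^2 / 12 = 1
    t5          = rt(n, df = 5) / sqrt(5 / 3),              # variance of t_5 is 5 / (5 - 2)
    exponential = rexp(n, rate = 1) - 1                     # Exp(1) has mean 1 and variance 1
  )
}

set.seed(1)
w <- drawMultipliers(1e5, "t5")
c(mean = mean(w), var = var(w))   # sample mean near 0, sample variance near 1
```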

by.row

logical indicator of observation location. If TRUE, the observations are by row and the columns contain different data series. If FALSE, the rows contain different data series and the observations are by column.

J0

threshold level for the wavelet filter coefficients. Allowed values are NA and any integer between 0 and J-2 (when the data series is of length T=2^J). If a specific integer is given, all wavelet coefficients at levels finer than J0 are set to 0. If NA, WiSEBoot creates bootstrap samples for all thresholds between 0 and J-2 and selects the threshold that minimizes the mean of the MSE.

wavFam

wavelet family. Allowed values are "DaubLeAsymm" and "DaubExPhase" – Daubechies Least Asymmetric and Daubechies Extremal Phase. This is the family used within the wavethresh package.

wavFil

wavelet filter number. Allowed values are integers between 4 and 10 when wavFam="DaubLeAsymm" or integers between 1 and 10 when wavFam="DaubExPhase". These correspond to the number of vanishing moments of the wavelet. This is the filter.number used within the wavethresh package.

wavBC

wavelet boundary condition. Allowed values are "periodic" and "symmetric". This is the bc used within the wavethresh package.

Details

The assumed model is

Y = γ_0 1 + γ_1 t + Wγ + e

where Y is the data vector, γ_0 and γ_1 are the linear parameters in time (t), γ are the wavelet coefficients (scaling and filter), and W is the DWT for a fixed wavelet basis. Note that in many cases of the DWT the scaling coefficient is equivalent to γ_0 and is therefore estimated as part of γ_0.

This model requires estimation of the linear terms γ_0 and γ_1. If the data are padded to a length T=2^J using padVector or padMatrix, it is recommended to supply the linearParam estimates as XParam and to set replaceLinearTrend=FALSE. If XParam is NA, the WiSEBoot function estimates γ_0 and γ_1 from the supplied data using least squares.

J0 sets the threshold within the wavelet coefficients. Our threshold is defined as the level above which all fine wavelet coefficients are set to 0.
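The thresholding rule can be sketched in base R as follows. thresholdCoef is a hypothetical helper, and the sketch assumes the coefficient vector is stored in the order documented for BootWavelet (scaling coefficient first, then filter level 0, level 1, ..., level J-1, with 2^j coefficients at level j).

```r
# Hypothetical helper: zero every coefficient at a level finer than J0.
# Assumed ordering: scaling, level 0, level 1, ..., level J-1,
# where level j holds 2^j coefficients.
thresholdCoef <- function(coef, J0) {
  J <- log2(length(coef))
  stopifnot(J == round(J), J0 >= 0, J0 <= J - 2)
  keep <- seq_len(2^(J0 + 1))   # scaling + levels 0..J0: 1 + (2^(J0+1) - 1) values
  coef[-keep] <- 0
  coef
}

g <- 1:16                       # toy coefficient vector, J = 4
thresholdCoef(g, J0 = 1)        # keeps the first 4 entries, zeros the other 12
```

Under this ordering, a threshold of J0 retains exactly 2^(J0+1) coefficients, so smaller values of J0 give smoother fits.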

For a single data series, Y, the WiSE bootstrap sample is obtained by

1. Find estimates of γ_0 and γ_1: g0 and g1. If XParam is supplied (for example, the linearParam output of padVector or padMatrix), those values are used.

2. Estimate all levels of wavelet coefficients, γ, using the residuals r = Y - g0 1 - g1 t. Call these estimated coefficients g.

3. For a set threshold, J0=j, set all coefficients in g finer than j to 0. Call this thresholded set of coefficients g_j. Perform the inverse wavelet transform with g_j. This smooth series may be called rSmooth.

4. Calculate the wavelet residuals using rWave = r - rSmooth.

5. A single bootstrap sample is defined as Y^* = g0 1 + g1 t + W g_j + τ N(0,1) rWave. Y^* is used to obtain estimates for the un-thresholded wavelet coefficients and linear parameters.
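The five steps above can be sketched for a single series in base R. This is an illustration under stated assumptions, not the package's implementation: it swaps in a hand-written orthonormal Haar transform (haarDWT and haarIDWT are hypothetical helpers) in place of the wavethresh filters, takes t = 1, ..., T as the time index, uses toy simulated data, and uses τ = √(log(T)) with N(0,1) multipliers.

```r
# Orthonormal Haar DWT; output order: scaling, level 0, ..., level J-1 (finest).
haarDWT <- function(x) {
  out <- numeric(0)
  while (length(x) > 1) {
    odd  <- x[seq(1, length(x), by = 2)]
    even <- x[seq(2, length(x), by = 2)]
    out  <- c((odd - even) / sqrt(2), out)  # prepend so coarser levels come first
    x    <- (odd + even) / sqrt(2)
  }
  c(x, out)
}

# Inverse of haarDWT.
haarIDWT <- function(coef) {
  x <- coef[1]
  pos <- 2
  while (pos <= length(coef)) {
    d   <- coef[pos:(pos + length(x) - 1)]
    nxt <- numeric(2 * length(x))
    nxt[seq(1, length(nxt), by = 2)] <- (x + d) / sqrt(2)
    nxt[seq(2, length(nxt), by = 2)] <- (x - d) / sqrt(2)
    pos <- pos + length(x)
    x   <- nxt
  }
  x
}

set.seed(42)
n  <- 64                                  # T = 2^6
tt <- seq_len(n)                          # assumed time index 1..T
Y  <- 0.5 + 0.02 * tt + sin(2 * pi * tt / 32) + rnorm(n, sd = 0.3)  # toy data

# Step 1: least-squares estimates g0 and g1
fit <- lm(Y ~ tt)
g0 <- coef(fit)[[1]]
g1 <- coef(fit)[[2]]

# Step 2: wavelet coefficients of the detrended series
r <- Y - g0 - g1 * tt
g <- haarDWT(r)

# Step 3: threshold at J0 = j, then invert to get the smooth series
j  <- 2
gj <- g
gj[-seq_len(2^(j + 1))] <- 0
rSmooth <- haarIDWT(gj)

# Step 4: wavelet residuals
rWave <- r - rSmooth

# Step 5: one bootstrap sample, then re-estimate from it
tau     <- sqrt(log(n))
Ystar   <- g0 + g1 * tt + rSmooth + tau * rnorm(n) * rWave
fitStar <- lm(Ystar ~ tt)
gStar   <- haarDWT(unname(residuals(fitStar)))  # un-thresholded coefficients
```

Repeating step 5 R times, each with fresh multipliers, would give R bootstrap estimates of the intercept, slope, and wavelet coefficients.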

Value

MSECriteria

matrix with 2 columns. The first column contains integer values corresponding to various J0 + 1. The second column contains the mean of the MSE. The MSE is computed using the estimated (smooth) bootstrap sample and the original data. The selected model minimizes the mean of MSE.
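As a sketch of how the selected threshold can be read off this matrix, the following uses a toy stand-in for MSECriteria. The numeric values are made up for illustration only; the layout (first column J0 + 1, second column mean MSE) follows the description above.

```r
# Toy stand-in for bootObj$MSECriteria (values are illustrative, not real output):
# column 1 holds J0 + 1, column 2 the mean MSE for that threshold.
mse <- cbind(1:5, c(0.90, 0.41, 0.18, 0.22, 0.35))

best <- which.min(mse[, 2])        # row with the smallest mean MSE
selectedJ0 <- mse[best, 1] - 1     # undo the "+ 1" offset of the first column
selectedJ0
```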

BootIntercept

matrix of R rows. Each row corresponds to a single bootstrap sample estimate of γ_0. The number of columns in the matrix corresponds to the number of data series supplied to the function (in X). The order of the columns of BootIntercept matches the order of the data series supplied.

BootSlope

matrix of R rows. Each row corresponds to a single bootstrap sample estimate of γ_1. The number of columns in the matrix corresponds to the number of data series supplied to the function (in X). The order of the columns of BootSlope matches the order of the data series supplied.
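Because each column holds the bootstrap estimates for one data series, column-wise summaries are straightforward. The sketch below computes percentile intervals from a toy stand-in for BootSlope. The percentile interval is one common way to summarize bootstrap samples and is an assumption here, not something this documentation prescribes.

```r
# Toy stand-in for bootObj$BootSlope: R = 200 rows, 3 data series (columns).
set.seed(7)
bootSlope <- matrix(rnorm(200 * 3, mean = 0, sd = 0.01), nrow = 200, ncol = 3)

# One output column per data series; rows are the 2.5% and 97.5% quantiles.
slopeCI <- apply(bootSlope, 2, quantile, probs = c(0.025, 0.975))
slopeCI
```

The same call applies to BootIntercept, since the two matrices share the same layout.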

BootWavelet

array of bootstrap estimates of the wavelet coefficients. The first dimension is R (bootstrap sample). The second dimension is T=2^J (data series length). The order of wavelet coefficients in the second dimension is: scaling level 0, filter level 0 (coarsest), filter level 1, ..., filter level J-1 (finest). The third dimension is the number of data series supplied. This array does not contain any boundary coefficients generated using the wavBC="symmetric" option.
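Given this ordering, filter level j occupies positions 2^j + 1 through 2^(j+1) of the second dimension. The sketch below uses a toy array in place of BootWavelet; levelIndex is a hypothetical helper introduced for illustration.

```r
# Hypothetical helper: positions of filter level j in the second dimension,
# given the ordering scaling, level 0, level 1, ..., level J-1.
levelIndex <- function(j) (2^j + 1):(2^(j + 1))

# Toy stand-in for bootObj$BootWavelet: R = 10 samples, T = 16, 2 data series.
set.seed(3)
bootWav <- array(rnorm(10 * 16 * 2), dim = c(10, 16, 2))

levelIndex(1)                          # positions 3 and 4
lev1 <- bootWav[, levelIndex(1), 1]    # level-1 coefficients for series 1
dim(lev1)                              # 10 rows (samples), 2 columns (coefficients)
```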

DataWavelet

matrix of wavelet coefficients from the data. This matrix only contains the coefficients from the selected model (fine level coefficients are set to 0). The number of rows is T=2^J and the order of coefficients within these rows matches the order of coefficients in the columns of BootWavelet. Thus, the first row contains the scaling coefficient, the second row contains the filter level 0 coefficient, etc. The number of columns matches the number of data series supplied, and the columns are ordered as in X.

XParam

matrix of linear slope and intercept in time from the data. If supplied, same as user-specified. Otherwise, estimated using least squares.

wavFam, wavFil, wavBC, TauSq, BootDistn, by.row

Same as supplied to the function.

Author(s)

Megan Heyman

References

The WiSE bootstrap methodology is defined in theoretical detail in Chatterjee, S. et al. "WiSE bootstrap for model selection" (in progress).

See Also

padVector, padMatrix, wavethresh-package

Examples

##R=10 bootstrap samples is not recommended.  For demonstration only.

##bootstrap one of the simulated series, threshold level 4 (not the truth)
data(SimulatedSNR15Series)
bootObj <- WiSEBoot(SimulatedSNR15Series[, 3], R=10, J0=4)

#boxplot of the bootstrap intercept and slope estimates (both 0 in truth)
par(mfrow=c(1,2))
boxplot(bootObj$BootIntercept); boxplot(bootObj$BootSlope)

#boxplot of the bootstrap wavelet coefficient estimates, level 1
par(mfrow=c(1,2))
boxplot(bootObj$BootWavelet[ , 3, 1]); boxplot(bootObj$BootWavelet[ , 4, 1])


##See what smooth level the bootstrap chooses (truth is J0=2)
bootObj2 <- WiSEBoot(SimulatedSNR15Series[ ,3], R=10)
bootObj2$MSECriteria

WiSEBoot documentation built on May 30, 2017, 3:32 a.m.