Description Usage Arguments Details Value Author(s) References See Also Examples
View source: R/SURE.trendfilter.R
SURE.trendfilter estimates the fixed-input
mean-squared error of a trend filtering estimator (via Stein's unbiased risk
estimate) on a grid of values for the hyperparameter gamma, and
returns the full error curve and the optimized trend filtering estimate
within a larger list with useful ancillary information.
1 2 3 4 5 6 7 8 9 10 11 12 | SURE.trendfilter(
x,
y,
weights,
k = 2L,
ngammas = 250L,
gammas = NULL,
nx.eval = 1500L,
x.eval = NULL,
thinning = NULL,
optimization.params = trendfilter.control.list(max_iter = 600L, obj_tol = 1e-10)
)
|
x |
The vector of observed values of the input variable (a.k.a. the predictor, covariate, explanatory variable, regressor, independent variable, control variable, etc.) |
y |
The vector of observed values of the output variable (a.k.a. the response, target, outcome, regressand, dependent variable, etc.). |
weights |
Currently mandatory. If the uncertainty in the
observed outputs is not well understood, use |
k |
The degree of the trend filtering estimator. Defaults to
|
ngammas |
Integer. The number of trend filtering hyperparameter values to run the grid search over. |
gammas |
Overrides |
nx.eval |
Integer. The length of the equally-spaced |
x.eval |
Overrides |
thinning |
logical. If |
optimization.params |
a named list of parameters produced by the
glmgen function
|
This will contain a very detailed description...
An object of class 'SURE.trendfilter'. This is a list with the following elements:
x.eval |
The grid of inputs the optimized trend filtering estimate was evaluated on. |
tf.estimate |
The optimized trend filtering estimate of the signal,
evaluated on |
validation.method |
"SURE" |
gammas |
Vector of hyperparameter values tested during validation (always returned in descending order). |
errors |
Vector of SURE error estimates corresponding to the *descending* set of gamma values tested during validation. |
gamma.min |
Hyperparameter value that minimizes the SURE error curve. |
edfs |
Vector of effective degrees of freedom for all trend filtering estimators fit during validation. |
edf.min |
The effective degrees of freedom of the optimally-tuned trend filtering estimator. |
i.min |
The index of |
x |
The vector of the observed inputs. |
y |
The vector of the observed outputs. |
weights |
A vector of weights for the observed outputs. These are
defined as |
fitted.values |
The optimized trend filtering estimate of the signal,
evaluated at the observed inputs |
residuals |
|
k |
The degree of the trend filtering estimator. |
thinning |
logical. If |
optimization.params |
a list of parameters that control the trend filtering convex optimization. |
n.iter |
Vector of the number of iterations needed for the ADMM
algorithm to converge within the given tolerance, for each hyperparameter
value. If many of these are exactly equal to |
x.scale, y.scale, data.scaled |
for internal use. |
Collin A. Politsch, collinpolitsch@gmail.com
Trend filtering with Stein's unbiased risk estimate
Politsch et al. (2020a). Trend filtering – I. A modern
statistical tool for time-domain astronomy and astronomical spectroscopy.
Monthly Notices of the Royal Astronomical Society, 492(3),
p. 4005-4018.
[Link]
Politsch et al. (2020b). Trend Filtering – II. Denoising
astronomical signals with varying degrees of smoothness. Monthly
Notices of the Royal Astronomical Society, 492(3), p. 4019-4032.
[Link]
Estimating effective degrees of freedom in trend filtering
Tibshirani and Taylor (2012). Degrees of freedom in lasso problems.
The Annals of Statistics, 40(2), p. 1198-1232.
[Link]
Stein's unbiased risk estimate
Tibshirani and Wasserman (2015). Stein’s Unbiased Risk Estimate.
36-702: Statistical Machine Learning course notes (Carnegie Mellon).
[Link]
Efron (2014). The Estimation of Prediction Error: Covariance Penalties
and Cross-Validation. Journal of the American Statistical Association,
99(467), p. 619-632.
[Link]
Stein (1981). Estimation of the Mean of a Multivariate Normal Distribution. The Annals of Statistics, 9(6), p. 1135-1151. [Link]
Trend filtering optimization algorithm
Ramdas and Tibshirani (2016). Fast and Flexible ADMM Algorithms
for Trend Filtering. Journal of Computational and Graphical
Statistics, 25(3), p. 839-858.
[Link]
Arnold, Sadhanala, and Tibshirani (2014). Fast algorithms for
generalized lasso problems. R package glmgen, Version 0.0.3.
[Link]
(Implementation of Ramdas and Tibshirani algorithm)
cv.trendfilter, bootstrap.trendfilter
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 | #############################################################################
## Quasar Lyman-alpha forest example ##
#############################################################################
# A quasar is an extremely luminous galaxy with an active supermassive black
# hole at its center. Absorptions in the spectra of quasars at vast
# cosmological distances from our galaxy reveal the presence of a gaseous
# medium permeating the entirety of intergalactic space -- appropriately
# named the 'intergalactic medium'. These absorptions allow astronomers to
# study the structure of the Universe using the distribution of these
# absorptions in quasar spectra. Particularly important is the 'forest' of
# absorptions that arise from the Lyman-alpha spectral line, which traces
# the presence of electrically neutral hydrogen in the intergalactic medium.
#
# Here, we are interested in denoising the Lyman-alpha forest of a quasar
# spectroscopically measured by the Sloan Digital Sky Survey. SDSS spectra
# are equally spaced in log10 wavelength space, aside from some instances of
# masked pixels.
data(quasar_spec)
# head(data)
#
# | log10.wavelength| flux| weights|
# |----------------:|----------:|---------:|
# | 3.5529| 0.4235348| 0.0417015|
# | 3.5530| -2.1143005| 0.1247811|
# | 3.5531| -3.7832341| 0.1284383|
SURE.out <- SURE.trendfilter(x = data$log10.wavelength,
y = data$flux,
weights = data$weights)
# Extract the estimated hyperparameter error curve and optimized trend
# filtering estimate from the `SURE.trendfilter` output, and transform the
# input grid to wavelength space (in Angstroms).
log.gammas <- log(SURE.out$gammas)
errors <- SURE.out$errors
log.gamma.min <- log(SURE.out$gamma.min)
wavelength <- 10 ^ (SURE.out$x)
wavelength.eval <- 10 ^ (SURE.out$x.eval)
tf.estimate <- SURE.out$tf.estimate
# Plot the results
par(mfrow = c(2,1), mar = c(5,4,2.5,1) + 0.1)
plot(x = log.gammas, y = errors, main = "SURE error curve",
xlab = "log(gamma)", ylab = "SURE error")
abline(v = log.gamma.min, lty = 2, col = "blue3")
text(x = log.gamma.min, y = par("usr")[4],
labels = "optimal gamma", pos = 1, col = "blue3")
plot(x = wavelength, y = SURE.out$y, type = "l",
main = "Quasar Lyman-alpha forest",
xlab = "Observed wavelength (Angstroms)", ylab = "Flux")
lines(wavelength.eval, tf.estimate, col = "orange", lwd = 2.5)
legend(x = "topleft", lwd = c(1,2), lty = 1, col = c("black","orange"),
legend = c("Noisy quasar Lyman-alpha forest",
"Trend filtering estimate"))
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.