HierBasis: Nonparametric Regression using Hierarchical Basis Functions
In asadharis/HierBasis: Nonparametric Regression and Sparse Additive Modeling

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/hier_basis.R

The main function for univariate non-parametric regression via the HierBasis estimator.

HierBasis(x, y, nbasis = length(y), max.lambda = NULL, nlam = 50,
  lam.min.ratio = 1e-04, m.const = 3, type = c("gaussian",
  "binomial"), max.iter = 100, tol = 0.001, max.iter.inner = 100,
  tol.inner = 0.001, refit = FALSE)

`x`	A vector of dependent variables.
`y`	A vector of response values we wish to fit the function to.
`nbasis`	The number of basis functions. Default is length(y).
`max.lambda`	The maximum lambda value for the penalized regression problem. If `NULL` (default), then the function selects a maximum lambda value such that the fitted function is the trivial estimate, i.e. the mean of `y`.
`nlam`	Number of lambda values for fitting penalized regression. The functions uses a sequence of `nlam` lambda values on the log-scale ranging from `max.lambda` to `max.lambda * lam.min.ratio`.
`lam.min.ratio`	The ratio of the largest and smallest lambda value.
`m.const`	The order of smoothness, usually not more than 3 (default).
`type`	Specifies type of regression, "gaussian" is for linear regression with continuous response and "binomial" is for logistic regression with binary response.
`max.iter`	Maximum number of iterations for outer loop for logistic regression.
`tol`	Tolerance for convergence of outer loop.
`max.iter.inner`	Maximum number of iterations for inner loop for logistic regression.
`tol.inner`	Tolerance for convergence of inner loop.
`refit`	If `TRUE`, function returns the re-fitted model, i.e. the least squares estimates based on the sparsity pattern. Currently the functionality is only available for the least squares loss, i.e. `type == "gaussian"`.

One of the main functions of the HierBasis package. This function fit a univariate nonparametric regression model. This is achieved by minimizing the following function of β:

minimize_{β} (1/2n)||y - Ψ β||^2 + λΩ_m(β) ,

where β is a vector of length J = nbasis. The penalty function Ω_m is given by

∑ a_{j,m}β[j:J],

where β[j:J] is beta[j:J] for a vector beta. Finally, the weights a_{j,m} are given by

a_{j,m} = j^m - (j-1)^m,

where m denotes the 'smoothness level'. For details see Haris et al. (2018).

An object of class HierBasis with the following elements:

`beta`	The `nbasis * nlam` matrix of estimated beta vectors.
`intercept`	The vector of size `nlam` of estimated intercepts.
`fitted.values`	The `nbasis * nlam` matrix of fitted values.
`lambdas`	The sequence of lambda values used for fitting the different models.
`x, y`	The original `x` and `y` values used for estimation.
`m.const`	The `m.const` value used for defining 'order' of smoothness.
`nbasis`	The maximum number of basis functions we allowed the method to fit.
`active`	The vector of length nlam. Giving the size of the active set.
`xbar`	The means of the vectors `x, x^2, x^3, ..., x^nbasis`.
`ybar`	The mean of the vector y.
`refit.mod`	An additional refitted model, including yhat, beta and intercepts. Only if `refit == TRUE`.

Asad Haris (aharis@uw.edu), Ali Shojaie and Noah Simon

Haris, A., Shojaie, A. and Simon, N. (2018). Nonparametric Regression with Adaptive Smoothness via a Convex Hierarchical Penalty. Available on request by authors.

predict.HierBasis, GetDoF.HierBasis

require(Matrix)

set.seed(1)

# Generate the points x.
n <- 300
x <- (1:300)/300

# A simple quadratic function.
y1 <- 5 * (x - 0.5)^2
y1dat <- y1 + rnorm(n, sd = 0.1)

# A sine wave example.
y2 <- - sin(10 * x - 4)
y2dat <- y2 + rnorm(n, sd = 0.2)

# An exponential function.
y3 <- exp(- 5 * x + 0.5)
y3dat <- y3 + rnorm(n, sd = 0.2)

poly.fit <- HierBasis(x, y1dat)
sine.fit <- HierBasis(x, y2dat)
exp.fit  <- HierBasis(x, y3dat)


plot(x, y1dat, type = "p", ylab = "y1")
lines(x, y1, lwd = 2)
lines(x, poly.fit$fitted.values[,30], col = "red", lwd = 2)

plot(x, y2dat, type = "p", ylab = "y1")
lines(x, y2, lwd = 2)
lines(x, sine.fit$fitted.values[,40], col = "red", lwd = 2)

plot(x, y3dat, type = "p", ylab = "y1")
lines(x, y3, lwd = 2)
lines(x, exp.fit$fitted.values[,40], col = "red", lwd = 2)