expectreg.ls: Expectile regression of additive models

expectreg.ls R Documentation

Description

Additive models are fitted with least asymmetrically weighted squares or quadratic programming to obtain expectiles for parametric, continuous, spatial and random effects.

Usage

expectreg.ls(formula, data = NULL, estimate = c("laws", "restricted", "bundle", "sheets"),
             smooth = c("schall", "ocv", "gcv", "cvgrid", "aic", "bic", "lcurve", "fixed"),
             lambda = 1, expectiles = NA, ci = FALSE, LAWSmaxCores = 1, ...)

expectreg.qp(formula, data = NULL, id = NA, smooth = c("schall", "acv", "fixed"),
             lambda = 1, expectiles = NA)

Arguments

formula

An R formula object consisting of the response variable, '~' and the sum of all effects that should be taken into consideration. Each effect has to be given through the function rb.

data

Optional data frame containing the variables used in the model, if the data is not explicitly given in the formula.

id

Potential additional variable identifying individuals in a longitudinal data set. Allows for a random intercept estimation.

estimate

Character string defining the estimation method that is used to fit the expectiles. Further detail on all available methods is given below.

smooth

Smoothing algorithm used to prevent overfitting. The 'schall' algorithm iterates the smoothing penalty lambda until it converges (REML). Generalised cross-validation 'gcv' and ordinary cross-validation 'ocv' minimize a score function using nlminb, while 'cvgrid' minimizes it with a grid search. The numerical minimisation is also possible with AIC ('aic') or BIC ('bic') as the score. The L-curve ('lcurve') is a new experimental grid search by Frasso and Eilers. With 'fixed', the function uses a fixed penalty.

lambda

The fixed penalty can be adjusted. Also serves as starting value for the smoothing algorithms.

expectiles

By default, the expectiles (0.01,0.02,0.05,0.1,0.2,0.5,0.8,0.9,0.95,0.98,0.99) are calculated. You may specify your own set of expectiles in a vector. The option may be set to 'density' to calculate a dense set of expectiles, which improves the subsequent use of cdf.qp and cdf.bundle.

ci

Whether a covariance matrix for confidence intervals and a summary is calculated.

LAWSmaxCores

Maximum number of cores to be used for parallelization.

...

Optional values for re-weighting the model with estimated weights and for combining selected models into one model.

Details

In least asymmetrically weighted squares (LAWS) each expectile is fitted independently from the others. LAWS minimizes:

S = \sum_{i=1}^{n}{ w_i(p)(y_i - \mu_i(p))^2}

with

w_i(p) = p 1_{(y_i > \mu_i(p))} + (1-p) 1_{(y_i < \mu_i(p))} .
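
The criterion above can be minimised by iteratively reweighted least squares. The following base R code is a minimal sketch of that idea, not the package's implementation: the helper name laws_fit is hypothetical, the model is a plain linear model without rb() bases or a smoothing penalty, and the data are simulated. Observations with y_i equal to the fitted value receive weight 1 - p here, which is a simplification of the indicator weights.

```r
# Minimal LAWS sketch in base R (illustrative only, not the package's code).
# laws_fit() is a hypothetical helper that fits one expectile of a linear
# model by iteratively reweighted least squares.
laws_fit <- function(X, y, p, max_iter = 100, tol = 1e-10) {
  beta <- qr.solve(X, y)                    # start from the least squares fit
  for (iter in seq_len(max_iter)) {
    mu <- drop(X %*% beta)
    w <- ifelse(y > mu, p, 1 - p)           # asymmetric weights w_i(p)
    # Weighted normal equations: (X' W X) beta = X' W y
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * y)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

set.seed(1)
x <- runif(200)
y <- 1 + 2 * x + rnorm(200, sd = 0.5)
X <- cbind(1, x)                            # design matrix with intercept

beta_50 <- laws_fit(X, y, p = 0.5)          # p = 0.5 reproduces least squares
beta_90 <- laws_fit(X, y, p = 0.9)          # an upper expectile line
```

For p = 0.5 all weights are equal, so the result coincides with the ordinary least squares fit; each other asymmetry is obtained by a separate call, which mirrors the statement that the expectiles are fitted independently.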

The restricted version first fits the 0.5 expectile (the mean) and then the residuals. Afterwards, the other expectiles are fitted as deviations from the mean expectile, each given by a factor times the fitted residuals. This algorithm is based on He (1997). The advantage is that expectile crossing cannot occur; the disadvantage is a suboptimal fit in certain heteroscedastic settings. Also, since the number of fits is significantly decreased, the restricted version is much faster.
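
The restricted idea can be illustrated with a simplified location-scale sketch in base R. This is not the package's code: it assumes a linear mean, a scale curve obtained by regressing the absolute residuals on the covariate, and simulated data, and the helper deviation_factor is hypothetical.

```r
# Simplified sketch of the restricted approach (illustrative assumptions only).
set.seed(2)
x <- runif(300)
y <- 1 + 2 * x + rnorm(300, sd = 0.5 + x)   # heteroscedastic noise

mean_fit <- lm(y ~ x)                       # step 1: the 0.5 expectile (mean)
r <- resid(mean_fit)
s <- fitted(lm(abs(r) ~ x))                 # step 2: scale curve from |residuals|

# Step 3: one deviation factor per asymmetry p, found by a one-parameter
# asymmetric least squares fit of the residuals r on the scale curve s.
deviation_factor <- function(r, s, p, max_iter = 100) {
  c_p <- 0
  for (iter in seq_len(max_iter)) {
    w <- ifelse(r > c_p * s, p, 1 - p)
    c_new <- sum(w * r * s) / sum(w * s^2)
    if (abs(c_new - c_p) < 1e-12) break
    c_p <- c_new
  }
  c_p
}

p_grid <- c(0.05, 0.25, 0.5, 0.75, 0.95)
factors <- sapply(p_grid, deviation_factor, r = r, s = s)

# Expectile curves: mean fit plus factor times scale curve (one column per p).
curves <- sapply(factors, function(c_p) fitted(mean_fit) + c_p * s)
```

Because every curve is the mean fit plus an increasing factor times the same positive scale curve, the curves cannot cross. The price is that all expectiles share one shape of deviation, which is the source of the suboptimal fit mentioned above.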

The expectile bundle resembles the restricted regression. First, a trend curve is fitted. Then the algorithm iterates between fitting the residuals and calculating the deviation factors for all the expectiles until the results are stable. This method therefore shares the advantages and disadvantages of the restricted version.

The expectile sheets construct a p-spline basis for the expectiles and perform a continuous fit over all expectiles by fitting the tensor product of the expectile spline basis and the basis of the covariates. As a consequence, crossing of expectiles is unlikely, while a good fit is still achieved in heteroscedastic scenarios.

The function expectreg.qp also fits a sheet over all expectiles, but it uses quadratic programming with constraints, so crossing of expectiles will definitely not happen. So far the function is implemented for one nonlinear or spatial covariate and further parametric covariates. It works with all smoothing methods.

Value

An object of class 'expectreg', which is basically a list consisting of:

lambda

The final smoothing parameters for all expectiles and for all effects in a list. For the restricted and the bundle regression there are only the mean and the residual lambda.

intercepts

The intercept for each expectile.

coefficients

A matrix of all the coefficients, for each base element a row and for each expectile a column.

values

The fitted values for each observation and all expectiles, separately in a list for each effect in the model, sorted in order of ascending covariate values.

response

Vector of the response variable.

covariates

List with the values of the covariates.

formula

The formula object that was given to the function.

asymmetries

Vector of fitted expectile asymmetries as given by argument expectiles.

effects

List of characters giving the types of covariates.

helper

List of additional parameters like neighbourhood structure for spatial effects or 'phi' for kriging.

design

Complete design matrix.

bases

Bases components of each covariate.

fitted

Fitted values \hat{y} .

covmat

Covariance matrix, estimated when ci = TRUE.

diag.hatma

Diagonal of the hat matrix. Used for model selection criteria.

data

Original data

smooth_orig

Unchanged original type of smoothing.

plot, predict, resid, fitted, effects and further convenient methods are available for class 'expectreg'.
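
As a brief illustration of the returned object, the following sketch fits a small model and inspects some of the components listed above. It assumes the expectreg package is installed; the call pattern is taken from the Examples, and the component names are those documented in this section.

```r
library(expectreg)

# Fit three expectiles with a fixed penalty (call pattern as in the Examples).
fit <- expectreg.ls(dist ~ rb(speed), data = cars, smooth = "fixed",
                    lambda = 5, expectiles = c(0.1, 0.5, 0.9))

fit$asymmetries          # asymmetries that were fitted
fit$intercepts           # intercept for each expectile
str(fit$coefficients)    # structure of the coefficient estimates
str(fit$lambda)          # smoothing parameters per effect

head(fitted(fit))        # fitted values via the 'expectreg' method
head(resid(fit))         # residuals via the 'expectreg' method
```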

Author(s)

Fabian Otto-Sobotka
Carl von Ossietzky University Oldenburg
https://uol.de

Thomas Kneib
Georg August University Goettingen
https://www.uni-goettingen.de

Sabine Schnabel
Wageningen University and Research Centre
https://www.wur.nl

Paul Eilers
Erasmus Medical Center Rotterdam
https://www.erasmusmc.nl

Linda Schulze Waltrup, Goeran Kauermann
Ludwig Maximilians University Muenchen
https://www.lmu.de

References

Schnabel S and Eilers P (2009) Optimal expectile smoothing. Computational Statistics and Data Analysis, 53:4168-4177.

Sobotka F and Kneib T (2010) Geoadditive Expectile Regression. Computational Statistics and Data Analysis, doi: 10.1016/j.csda.2010.11.015.

Schnabel S and Eilers P (2011) Expectile sheets for joint estimation of expectile curves (under review at Statistical Modelling).

Frasso G and Eilers P (2013) Smoothing parameter selection using the L-curve (under review).

See Also

rb, expectreg.boost

Examples

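library(expectreg)
# Smoothing parameter chosen by BIC; four expectiles fitted by LAWS (the default)
ex = expectreg.ls(dist ~ rb(speed), data = cars, smooth = "bic", lambda = 5,
                  expectiles = c(0.01, 0.2, 0.8, 0.99))
# Fixed penalty with the restricted estimator; this fit replaces the one above
ex = expectreg.ls(dist ~ rb(speed), data = cars, smooth = "fixed", lambda = 5,
                  estimate = "restricted")
plot(ex)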


data("lidar", package = "SemiPar")

explaws <- expectreg.ls(logratio~rb(range,"pspline"),data=lidar,smooth="gcv",
                        expectiles=c(0.05,0.5,0.95))
print(explaws)
plot(explaws)

### Expectile regression using a fixed penalty
plot(expectreg.ls(logratio~rb(range,"pspline"),data=lidar,smooth="fixed",
     lambda=1,expectiles=c(0.05,0.25,0.75,0.95)))
plot(expectreg.ls(logratio~rb(range,"pspline"),data=lidar,smooth="fixed",
     lambda=0.0000001,expectiles=c(0.05,0.25,0.75,0.95)))
# As the plot shows, a penalty that is too small causes overfitting of the data.
plot(expectreg.ls(logratio~rb(range,"pspline"),data=lidar,smooth="fixed",
     lambda=50,expectiles=c(0.05,0.25,0.75,0.95)))
# If the penalty parameter is chosen too large, the expectile curves are
# smooth but no longer represent the data.

expectreg documentation built on May 29, 2024, 6:12 a.m.