plaqr: Partially Linear Additive Quantile Regression


Returns an object of classes "plaqr" and "rq" that represents a quantile regression fit. Each nonlinear term z is expanded with bs(z) before the model is fitted. For example, with linear terms x1, x2 and nonlinear terms z1, z2, the formula of the model (as it appears in R) becomes y ~ x1 + x2 + bs(z1) + bs(z2), where each bs() term is a B-spline basis.
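The expansion described above can be sketched with the splines package alone. This is a minimal illustration, not the package's internal code: the variable `z1` is simulated, and `reformulate()` is used here only to show what the expanded formula looks like.

```r
library(splines)  # ships with R; provides bs()

set.seed(1)
z1 <- runif(100)  # hypothetical nonlinear covariate (an assumption for this sketch)

# Default bs(): cubic B-spline basis, no interior knots, no intercept column
B <- bs(z1)
dim(B)  # one row per observation, three basis columns

# A formula of the form described above, built from term labels
f <- reformulate(c("x1", "x2", "bs(z1)", "bs(z2)"), response = "y")
f
```

With the default settings each nonlinear term contributes three coefficients, which is why the fitted models list `bs(z1)1` to `bs(z1)3`.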

plaqr(formula, nonlinVars=NULL, tau=.5, data=NULL, subset,   
            weights, na.action, method = "br", model = TRUE, 
            contrasts = NULL, splinesettings=NULL, ...)

`formula`	a formula object, with the response on the left of a `~` operator, and the linear terms, separated by `+` operators, on the right. Any terms on the right of the `~` operator that also appear in `nonlinVars` will be included in the model as spline terms, not linear terms.
`nonlinVars`	a one-sided formula object, with a `~` operator to the left of the nonlinear terms separated by `+` operators. A term appearing in both `formula` and `nonlinVars` will be treated as a nonlinear term. If `nonlinVars` is not `NULL`, then an intercept will automatically be included in the model (even if a `-1` or `0` term is included in `formula`).
`tau`	the quantile to be estimated; a number strictly between 0 and 1 (for now).
`data`	a data.frame in which to interpret the variables named in the formula, or in the subset and the weights argument. If this is missing, then the variables in the formula should be on the search list. This may also be a single number to handle some special cases – see below for details.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`weights`	vector of observation weights; if supplied, the algorithm fits to minimize the sum of the weights multiplied into the absolute residuals. The length of weights must be the same as the number of observations. The weights must be nonnegative and it is strongly recommended that they be strictly positive, since zero weights are ambiguous.
`na.action`	a function to filter missing data. This is applied to the model.frame after any subset argument has been used. The default (with `na.fail`) is to create an error if any missing values are found. A possible alternative is `na.omit`, which deletes observations that contain one or more missing values.
`model`	if TRUE then the model frame is returned. This is essential if one wants to call summary subsequently.
`method`	the algorithmic method used to compute the fit. The default, `"br"`, is the modified version of the Barrodale and Roberts algorithm for l1-regression, used by `l1fit` in S and described in detail in Koenker and d'Orey (1987, 1994). It is quite efficient for problems up to several thousand observations, and may be used to compute the full quantile regression process. It also implements a scheme for computing confidence intervals for the estimated parameters, based on inversion of a rank test described in Koenker (1994). For larger problems it is advantageous to use the Frisch–Newton interior point method `"fn"`. For very large problems one can use the Frisch–Newton approach after preprocessing, `"pfn"`. Both of the latter methods are described in detail in Portnoy and Koenker (1997). A further option, `"fnc"`, enables the user to specify linear inequality constraints on the fitted coefficients; in this case one needs to specify the matrix `R` and the vector `r` representing the constraints in the form Rb ≥ r. See the examples. Finally, there are two penalized methods, `"lasso"` and `"scad"`, that implement the lasso penalty and Fan and Li's smoothly clipped absolute deviation penalty, respectively. These methods should probably be regarded as experimental.
`contrasts`	a list giving contrasts for some or all of the factors appearing in the model formula (default = `NULL`). The elements of the list should have the same name as the variable and should be either a contrast matrix (specifically, any full-rank matrix with as many rows as there are levels in the factor), or else a function to compute such a matrix given the number of levels.
`splinesettings`	a list of length equal to the number of nonlinear effects, containing arguments to pass to the `bs` function for each term. Each element of the list is either `NULL` or a list with named elements corresponding to the arguments in `bs`. If not `NULL`, the first element of `splinesettings` corresponds to the first nonlinear effect, and so on.
`...`	additional arguments for the fitting routines (see the `rq` function in the ‘quantreg’ package).
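As a rough illustration of how a `splinesettings` list lines up with the nonlinear terms, the sketch below builds one B-spline basis per term using `do.call`. The helper `make_basis` and the simulated `z1`, `z2` are assumptions made for this example; they are not part of plaqr, and the package may construct its `bs` calls differently.

```r
library(splines)

set.seed(2)
z1 <- runif(100, -1, 1)  # hypothetical nonlinear covariates
z2 <- runif(100, -1, 1)

# Same shape as `splinesettings`: one entry per nonlinear term, in order
ss <- vector("list", 2)
ss[[2]] <- list(degree = 5, Boundary.knots = c(-1, 1))

# Hypothetical helper: call bs() with the optional settings for one term
make_basis <- function(z, settings) do.call(bs, c(list(z), settings))

B1 <- make_basis(z1, ss[[1]])  # NULL entry, so bs() defaults apply
B2 <- make_basis(z2, ss[[2]])  # degree-5 basis with fixed boundary knots

c(ncol(B1), ncol(B2))
```

A `NULL` entry leaves the corresponding term at the `bs` defaults, so only the terms that need non-default settings have to be filled in.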

Returns the following:

`coefficients`	the coefficients from the fitted model.
`x`	optionally the model matrix, if `x=TRUE`.
`y`	optionally the response, if `y=TRUE`.
`residuals`	the residuals from the fit.
`dual`	the vector of dual variables from the fit.
`fitted.values`	fitted values from the fit.
`formula`	the formula that was used in the `rq` function.
`rho`	the value of the objective function at the solution.
`model`	optionally the model frame, if `model=TRUE`.
`linear`	the linear terms used in the model fit.
`nonlinear`	the nonlinear terms used in the model fit.
`z`	the values of the nonlinear terms.
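For readers unfamiliar with the objective behind `rho` and `tau`: quantile regression minimizes a sum of check (pinball) losses of the residuals. The base-R sketch below illustrates this for the simplest case of a constant fit on simulated data. The function name `check_loss` and the data are assumptions for this example, and it does not reproduce how the package computes `rho`.

```r
# Check (pinball) loss for quantile level tau
check_loss <- function(u, tau) u * (tau - (u < 0))

set.seed(3)
y   <- rnorm(101)  # simulated response (an assumption for this sketch)
tau <- 0.25

# Objective for a constant fit q, evaluated at each observed value of y
obj   <- sapply(y, function(q) sum(check_loss(y - q, tau)))
q_hat <- y[which.min(obj)]

# The minimizer coincides with the type-1 sample quantile
all.equal(q_hat, unname(quantile(y, tau, type = 1)))
```

Changing `tau` shifts the weight between positive and negative residuals, which moves the fitted quantile up or down.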

Adam Maidman

Hastie, T. J. (1992) Generalized additive models. Chapter 7 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

Koenker, R. W. (2005). Quantile Regression. Cambridge University Press.

Sherwood, B. and Wang, L. (2016). Partially linear additive quantile regression in ultra-high dimension. The Annals of Statistics 44, 288-317.

Maidman, A. and Wang, L. (2017). New Semiparametric Method for Predicting High-Cost Patients. Preprint.

data(simData)

ss <- vector("list", 2)
ss[[2]]$degree <- 5
ss[[2]]$Boundary.knots <- c(-1, 1)

plaqr(y~., nonlinVars=~z1+z2, data=simData) 
#same as plaqr(formula= y~x1+x2+x3, nonlinVars=~z1+z2, data=simData)

plaqr(y~0, nonlinVars=~z1+z2, data=simData, splinesettings=ss) #no linear terms in the model

plaqr(y~., data=simData) #all linear terms

Loading required package: quantreg
Loading required package: SparseM

Attaching package: 'SparseM'

The following object is masked from 'package:base':

    backsolve

Loading required package: splines
Call:
plaqr(formula = y ~ ., nonlinVars = ~z1 + z2, data = simData)

Coefficients:
(Intercept)          x1          x2          x3     bs(z1)1     bs(z1)2 
  -5.655792    3.502119    1.734125    1.919248   19.309354  -17.410101 
    bs(z1)3     bs(z2)1     bs(z2)2     bs(z2)3 
   2.152656    7.806139    1.278381    8.891872 

Degrees of freedom: 100 total; 90 residual
Call:
plaqr(formula = y ~ 0, nonlinVars = ~z1 + z2, data = simData, 
    splinesettings = ss)

Coefficients:
                                   (Intercept) 
                                    -3.7242930 
                                       bs(z1)1 
                                    23.0419424 
                                       bs(z1)2 
                                   -18.9673497 
                                       bs(z1)3 
                                     2.8660628 
bs(z2, degree = 5, Boundary.knots = c(-1, 1))1 
                                    -0.2675056 
bs(z2, degree = 5, Boundary.knots = c(-1, 1))2 
                                    14.0909702 
bs(z2, degree = 5, Boundary.knots = c(-1, 1))3 
                                    -9.4923466 
bs(z2, degree = 5, Boundary.knots = c(-1, 1))4 
                                     9.8296028 
bs(z2, degree = 5, Boundary.knots = c(-1, 1))5 
                                     8.2595997 

Degrees of freedom: 100 total; 91 residual
Call:
plaqr(formula = y ~ ., data = simData)

Coefficients:
(Intercept)          x1          x2          x3          z1          z2 
   4.741973    4.218701    1.719575    1.731902  -10.309600    3.800588 

Degrees of freedom: 100 total; 94 residual
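The residual degrees of freedom reported above equal the number of observations minus the number of fitted coefficients. The sketch below checks that count with `model.matrix` on a simulated data frame that only mimics the column names of `simData`; the data are hypothetical, and no quantile regression is fitted.

```r
library(splines)

set.seed(4)
n <- 100
# Hypothetical stand-in with the same column names as simData
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                z1 = runif(n), z2 = runif(n, -1, 1))

# Design matrices matching the three calls in the examples
X1 <- model.matrix(~ x1 + x2 + x3 + bs(z1) + bs(z2), data = d)
X2 <- model.matrix(~ bs(z1) + bs(z2, degree = 5, Boundary.knots = c(-1, 1)),
                   data = d)
X3 <- model.matrix(~ x1 + x2 + x3 + z1 + z2, data = d)

# Residual degrees of freedom: observations minus coefficients
n - c(ncol(X1), ncol(X2), ncol(X3))  # 90 91 94
```

Each default `bs` term adds three columns and the degree-5 term adds five, which accounts for the 10, 9 and 6 coefficients in the three fits.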