# cobs: COnstrained B-Splines Nonparametric Regression Quantiles In cobs: Constrained B-Splines (Sparse Matrix Based)

## Description

Computes constrained quantile curves using linear or quadratic splines. The median spline (L_1 loss) is a robust (constrained) smoother.

## Usage

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13``` ```cobs(x, y, constraint = c("none", "increase", "decrease", "convex", "concave", "periodic"), w = rep(1,n), knots, nknots = if(lambda == 0) 6 else 20, method = "quantile", degree = 2, tau = 0.5, lambda = 0, ic = c("AIC", "SIC", "BIC", "aic", "sic", "bic"), knots.add = FALSE, repeat.delete.add = FALSE, pointwise = NULL, keep.data = TRUE, keep.x.ps = TRUE, print.warn = TRUE, print.mesg = TRUE, trace = print.mesg, lambdaSet = exp(seq(log(lambda.lo), log(lambda.hi), length = lambda.length)), lambda.lo = f.lambda*1e-4, lambda.hi = f.lambda*1e3, lambda.length = 25, maxiter = 100, rq.tol = 1e-8, toler.kn = 1e-6, tol.0res = 1e-6, nk.start = 2) ```

## Arguments

 `x` vector of covariate; missing values are omitted. `y` vector of response variable. It must have the same length as `x`. `constraint` character (string) specifying the kind of constraint; must be one of the values in the default list above; may be abbreviated. More flexible constraints can be specified via the `pointwise` argument (below). `w` vector of weights the same length as `x` (y) assigned to both `x` and `y`; default to all weights being one. `knots` vector of locations of the knot mesh; if missing, `nknots` number of `knots` will be created using the specified `method` and automatic knot selection will be carried out for regression B-spline (`lambda=0`); if not missing and `length(knots)==nknots`, the provided knot mesh will be used in the fit and no automatic knot selection will be performed; otherwise, automatic knots selection will be performed on the provided `knots`. `nknots` maximum number of knots; defaults to 6 for regression B-splines, 20 for smoothing B-splines. `method` character string specifying the method for generating `nknots` number of `knots` when `knots` is not provided; either `"quantile"` (equally spaced in percentile levels) or `"uniform"` (equally spaced knots); defaults to "quantile". `degree` degree of the splines; 1 for linear spline (equivalent to L_1-roughness) and 2 for quadratic spline (corresponding to L_infinity ('L_oo') roughness); defaults to 2. `tau` desired quantile level; defaults to 0.5 (median). `lambda` penalty parameter λ λ = 0: no penalty (regression B-spline); λ > 0: smoothing B-spline with the given λ; λ < 0: smoothing B-spline with λ chosen by a Schwarz-type information criterion. `ic` string indicating the information criterion used in knot deletion and addition for the regression B-spline method, i.e., when `lambda == 0`; `"AIC"` (Akaike-type, equivalently `"aic"`) or `"SIC"` (Schwarz-type, equivalently `"BIC"`, `"sic"` or `"bic"`). Defaults to `"AIC"`. Note that the default was `"SIC"` up to cobs version 1.1-6 (dec.2008). `knots.add` logical indicating if an additional step of stepwise knot addition should be performed for regression B-splines. `repeat.delete.add` logical indicating if an additional step of stepwise knot deletion should be performed for regression B-splines. `pointwise` an optional three-column matrix with each row specifies one of the following constraints: `( 1,xi,yi)`:fitted value at xi will be >= yi; `(-1,xi,yi)`:fitted value at xi will be <= yi; `( 0,xi,yi)`:fitted value at xi will be = yi; `( 2,xi,yi)`:derivative of the fitted function at xi will be yi. `keep.data` logical indicating if the `x` and `y` input vectors after removing `NA`s should be kept in the result. `keep.x.ps` logical indicating if the pseudo design matrix X\~ should be returned (as sparse matrix). That is needed for interval prediction, ```predict.cobs(*, interval=..)```. `print.warn` flag for printing of interactive warning messages; true by default; set to `FALSE` if performing simulation. `print.mesg` logical flag or integer for printing of intermediate messages; true by default. Probably needs to be set to `FALSE` in simulations. `trace` integer >= 0 indicating how much diagnostics the low-level code in `drqssbc2` should print; defaults to `print.mesg`. `lambdaSet` numeric vector of lambda values to use for grid search; in that case, defaults to a geometric sequence (a “grid in log scale”) from `lambda.lo` to `lambda.hi` of length `lambda.length`. `lambda.lo, lambda.hi` lower and upper bound of the grid search for lambda (when `lambda < 0`). The defaults are meant to keep everything scale-equivariant and are hence using the factor f = s[x]^d, i.e., `f.lambda <- sd(x)^degree`. Note however that currently the underlying algorithms in package quantreg are not scale equivariant yet. `lambda.length` number of grid points in the grid search for optimal lambda. `maxiter` upper bound of the number of iterations; defaults to 100. `rq.tol` numeric convergence tolerance for the interior point algorithm called from `rq.fit.sfnc()` or `rq.fit.sfn()`. `toler.kn` numeric tolerance for shifting the boundary knots outside; defaults to 10^(-6). `tol.0res` tolerance for testing |r_i| = 0, passed to `qbsks2` and `drqssbc2`. `nk.start` number of starting knots used in automatic knot selection. Defaults to the minimum of 2 knots.

## Details

`cobs()` computes the constraint quantile smoothing B-spline with penalty when lambda is not zero.
If lambda < 0, an optimal lambda will be chosen using Schwarz type information criterion.
If lambda > 0, the supplied lambda will be used.
If lambda = 0, cobs computes the constraint quantile regression B-spline with no penalty using the provided knots or those selected by Akaike or Schwarz information criterion.

## Value

an object of class `cobs`, a list with components

 `call` the `cobs(..)` call used for creation. `tau, degree` same as input `constraint` as input (but no more abbreviated). `pointwise` as input. `coef` B-spline coefficients. `knots` the final set of knots used in the computation. `ifl` exit code := `1 + ierr` and `ierr` is the error from `rq.fit.sfnc` (package quantreg); consequently, `ifl = 1` means “success”; all other values point to algorithmic problems or failures. `icyc` length 2: number of cycles taken to achieve convergence for final lambda, and total number of cycles for all lambdas. `k` the effective dimensionality of the final fit. `k0` (usually the same) `x.ps` the pseudo design matrix X (as returned by `qbsks2`). `resid` vector of residuals from the fit. `fitted` vector of fitted values from the fit. `SSy` the sum of squares around centered `y` (e.g. for computation of R^2.) `lambda` the penalty parameter used in the final fit. `pp.lambda` vector of all lambdas used for lambda search when `lambda` < 0 on input. `pp.sic` vector of Schwarz information criteria evaluated at `pp.lambda`; note that it is not quite sure how good these are for determining an optimal `lambda`.

## References

Ng, P. and Maechler, M. (2007) A Fast and Efficient Implementation of Qualitatively Constrained Quantile Smoothing Splines, Statistical Modelling 7(4), 315-328.

Koenker, R. and Ng, P. (2005) Inequality Constrained Quantile Regression, Sankhya, The Indian Journal of Statistics 67, 418–440.

He, X. and Ng, P. (1999) COBS: Qualitatively Constrained Smoothing via Linear Programming; Computational Statistics 14, 315–337.

Koenker, R. and Ng, P. (1996) A Remark on Bartels and Conn's Linearly Constrained L1 Algorithm, ACM Transaction on Mathematical Software 22, 493–495.

Ng, P. (1996) An Algorithm for Quantile Smoothing Splines, Computational Statistics & Data Analysis 22, 99–118.

Bartels, R. and Conn A. (1980) Linearly Constrained Discrete L_1 Problems, ACM Transaction on Mathematical Software 6, 594–608.

A postscript version of the paper that describes the details of COBS can be downloaded from http://www.cba.nau.edu/pin-ng/cobs.html.

`smooth.spline` for unconstrained smoothing splines; `bs` for unconstrained (regression) B-splines.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32``` ```x <- seq(-1,3,,150) y <- (f.true <- pnorm(2*x)) + rnorm(150)/10 ## specify pointwise constraints (boundary conditions) con <- rbind(c( 1,min(x),0), # f(min(x)) >= 0 c(-1,max(x),1), # f(max(x)) <= 1 c(0, 0, 0.5))# f(0) = 0.5 ## obtain the median REGRESSION B-spline using automatically selected knots Rbs <- cobs(x,y, constraint= "increase", pointwise = con) Rbs plot(Rbs, lwd = 2.5) lines(spline(x, f.true), col = "gray40") lines(predict(cobs(x,y)), col = "blue") mtext("cobs(x,y) # completely unconstrained", 3, col= "blue") ## compute the median SMOOTHING B-spline using automatically chosen lambda Sbs <- cobs(x,y, constraint="increase", pointwise= con, lambda= -1) summary(Sbs) plot(Sbs) ## by default includes SIC ~ lambda Sb1 <- cobs(x,y, constraint="increase", pointwise= con, lambda= -1, degree = 1) summary(Sb1) plot(Sb1) plot(Sb1, which = 2) # only the data + smooth rug(Sb1\$knots, col = 4, lwd = 1.6)# (too many knots) xx <- seq(min(x) - .2, max(x)+ .2, len = 201) pxx <- predict(Sb1, xx, interval = "both") lines(pxx, col = 2) mtext(" + pointwise and simultaneous 95% - confidence intervals") matlines(pxx[,1], pxx[,-(1:2)], col= rep(c("green3","blue"), c(2,2)), lty=2) ```

### Example output    ```qbsks2():
Performing general knot selection ...

Deleting unnecessary knots ...
COBS regression spline (degree = 2) from call:
cobs(x = x, y = y, constraint = "increase", pointwise = con)
{tau=0.5}-quantile;  dimensionality of fit: 5 from {5}
x\$knots[1:4]: -1.0000040, -0.2214765,  1.3892617,  3.0000040
qbsks2():
Performing general knot selection ...

Deleting unnecessary knots ...

Searching for optimal lambda. This may take a while.
While you are waiting, here is something you can consider
to speed up the process:
(a) Use a smaller number of knots;
(b) Set lambda==0 to exclude the penalty term;
(c) Use a coarser grid by reducing the argument
'lambda.length' from the default value of 25.

WARNING!  Since the optimal lambda chosen by SIC
corresponds to the roughest possible fit, you should
plot() the returned object (which plots 'sic' against 'lambda')
and possibly consider doing one of the following:
(1) reduce 'lambda.lo', increase 'lambda.hi', increase 'lambda.length' or all of the above;
(2) modify the number of knots.

COBS smoothing spline (degree = 2) from call:
cobs(x = x, y = y, constraint = "increase", lambda = -1, pointwise = con)
{tau=0.5}-quantile;  dimensionality of fit: 12 from {22,21,20,18,17,15,13,12,11,9,7,4,3}
x\$knots[1:20]: -1.0000040, -0.8120805, -0.5973154, ... ,  3.0000040
lambda = 6.313949, selected via SIC, out of 25 ones.
with 3 pointwise constraints
coef[1:22]: -1.332889e-06,  3.130875e-02,  1.116407e-01,  2.134664e-01,  3.314315e-01, ... ,  3.499097e-01
R^2 = 91.64% ;  empirical tau (over all): 71/150 = 0.4733333 (target tau= 0.5)

Searching for optimal lambda. This may take a while.
While you are waiting, here is something you can consider
to speed up the process:
(a) Use a smaller number of knots;
(b) Set lambda==0 to exclude the penalty term;
(c) Use a coarser grid by reducing the argument
'lambda.length' from the default value of 25.

WARNING!  Since the optimal lambda chosen by SIC
corresponds to the roughest possible fit, you should
plot() the returned object (which plots 'sic' against 'lambda')
and possibly consider doing one of the following:
(1) reduce 'lambda.lo', increase 'lambda.hi', increase 'lambda.length' or all of the above;
(2) modify the number of knots.

COBS smoothing spline (degree = 1) from call:
cobs(x = x, y = y, constraint = "increase", degree = 1, lambda = -1,     pointwise = con)
{tau=0.5}-quantile;  dimensionality of fit: 6 from {21,20,18,16,14,13,6,5,4,3}
x\$knots[1:20]: -1.0000040, -0.8120805, -0.5973154, ... ,  3.0000040
lambda = 10.59621, selected via SIC, out of 25 ones.
with 3 pointwise constraints
coef[1:20]: -0.00000200,  0.09395973,  0.20134228,  0.30872483,  0.41610738, ... ,  1.00000002
R^2 = 91.3% ;  empirical tau (over all): 72/150 = 0.48 (target tau= 0.5)
```

cobs documentation built on Jan. 24, 2020, 5:10 p.m.