Compute the maximum likelihood estimator of the generalised additive regression with shape constraints
Description
This function uses the active set algorithm to compute the maximum likelihood estimator (mle) of the generalised additive regression with shape constraints. Each component function of the additive predictors is assumed to belong to one of the nine possible shape restrictions. The estimator's value at the data points is unique.
The output is an object of class scar
which contains all the information
needed to plot the estimator using the plot
method, or
to evaluate it using the predict
method.
Usage
1 2 
Arguments
x 
Observed covariates in R^d, in the form of an n x d
numeric 
y 
Observed responses, in the form of a numeric 
shape 
A vector that specifies the shape restrictions for each component function, in the form of a string vector of length d. The string allowed and its corresponding shape constraint is listed as follows (see Details):

family 
A description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. Currently only the following five common exponential families are allowed: Gaussian, Binomial, Poisson, and Gamma. By default the canonical link function is used. 
weights 
An optional vector of prior weights to be used when maximising the likelihood. It is a numeric vector of length n. By default equal weights are used. 
epsilon 
Positive convergence tolerance epsilon when performing the
iteratively reweighted least squares (IRLS) method at each iteration of
the active set algorithm. See 
Details
For i=1,...,n, let X_i be the ddimensional covariates, Y_i be the corresponding onedimensional response and w_i be its weight. The generalised additive model can be written as
g(mu) = f(x),
where x=(x_1,...,x_d)^T, g is a known link function and f is an additive function (to be estimated).
Assume the canonical link function is used here, then the maximum likelihood estimator of the generalised additive model based on observations (X_1,Y_1), ..., (X_n,Y_n) is the function that maximises
[w_1 (Y_1 f(X_1) B(f(X_1))) + ... + w_n (Y_n f(X_n) B(f(X_n)))]/n
subject to the restrictions that for every j=1,...,d,
the jth additive component of f satisfies the constraint indicated by the
jth element of shape
. Here B(.) is the logpartition function of
the specified exponential family distribution, and w_i are the weights. For i.i.d. data,
w_i should be 1 for each i.
To make each component of f identifiable, we write
f(x) = f_1(x_1) + ... + f_d(x_1) + c
and let f_j(0) = 0 for every j=1,...,d. In case zero is outside the range of the jth observed covariate, for the sake of convenience, we set f_j to be zero at the sample mean of the jth predictor.
This problem can then be rewritten as a concave optimisation problem, and our function uses the active set algorithm to find out the maximum likelihood estimator. A general introduction can be found in Nocedal and Wright (2006). A detailed description of our algorithm can be found in Chen and Samworth (2014). See also Groeneboom, Jongbloed and Wellner (2008) for some theoretical supports.
Value
An object of class scar
, with the following components:
x 
Covariates copied from input. 
y 
Response copied from input. 
shape 
Shape vector copied from input. 
weights 
The vector of weights copied from input. 
family 
The exponential family copied from input. 
componentfit 
Value of the fitted component function at each observed
covariate, in the form of an n x d numeric 
constant 
The estimated value of the constant c in the additive function f (see Details). 
deviance 
Up to a constant, minus twice the maximised loglikelihood.
Where applicable, the constant is chosen to make the saturated
model to have zero deviance. See also 
nulldeviance 
The deviance for the null model. 
iter 
Total number of iterations of the active set algorithm 
.
Note
We acknowledge that glm.fit
from the R package
stats is called to perform the method of iterated reweighted least squares
(IRLS) in our routine. It is possible to speed up the implementation considerably
by simply suppressing all the runtime checks there.
If all the component functions are linear, then it is prefered to call directly
the function glm
.
For the onedimensional covariate, see the pool adjacent violators algorithm (PAVA)
of Robertson, Wright and Dykstra (1998) and the support reduction method of
Groeneboom, Jongbloed and Wellner (2008). See also the R package
Iso
.
A different approach to tackle this problem is to use splines. See the R package
scam
. We stress here that our approach is free
of tuning parameters while scam
is not, which
can be viewed as a major difference.
To estimate the generalised additive regression function without any shape
restrictions, see Wood (2004) and Hastie and Tibshirani (1990).
Their corresponding R implementations are mgcv
and gam
.
Author(s)
Yining Chen and Richard Samworth
References
Chen, Y. and Samworth, R. J. (2014). Generalised additive and index models with shape constraints. arXiv:1404.2957.
Groeneboom, P., Jongbloed, G. and Wellner, J.A. (2008). The support reduction algorithm for computing nonparametric function estimates in mixture models. Scandinavian Journal of Statistics, 35, 385399.
Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models. Chapman and Hall, London.
Meyer, M. C. (2013). Semiparametric additive constrained regression. Journal of nonparametric statistics, 25, 715743.
Nocedal, J., and Wright, S. J. (2006). Numerical Optimization, 2nd edition. Springer, New York.
Robertson, T., Wright, F. T. and Dykstra, R. L. (1988). Order Restricted Statistical Inference. Wiley, New York.
Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. Springer, New York.
Wood, S.N. (2004). Stable and efficient multiple smoothing parameter estimation for generalized additive models. Journal of American Statistical Association, 99, 673686.
See Also
plot.scar
, predict.scar
, scair
,
Iso
,
scam
,
mgcv
,
gam
,
glm
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29  ## An example in the Poission additive regression setting:
## Define the additive function f on the scale of the predictors
f<function(x){
return(1*abs(x[,1]) + 2*(x[,2])^2 + 3*(abs(x[,3]))^3)
}
## Simulate the covariates and the responses
## covariates are drawn uniformly from [1,1]^3
set.seed(0)
d = 3
n = 500
x = matrix(runif(n*d)*21,nrow=n,ncol=d)
rpoisson < function(m){rpois(1,exp(m))}
y = sapply(f(x),rpoisson)
## All the components are convex so one can use scar
shape=c("cvx","cvx","cvx")
object = scar(x,y,shape=shape, family=poisson())
## Plot each component of the estimatied additive function
plot(object)
## Evaluate the estimated additive function at 10^4 random points
## drawing from the interior of the support
testx = matrix((runif(10000*d)*1.960.98),ncol=d)
testf = predict(object,testx)
## and calculate the (estimated) absolute prediction error
mean(abs(testff(testx)))
