Calculating penalized copula density with penalized hierarchical B-splines

Share:

Description

Main program for estimation of copula densities with penalized hierarchical B-splines. The estimation can be done for multivariate datasets. The response is called 'y', the covariates 'x'. We estimate densities using penalized hierarchical B-splines. This is done by choosing the polynomial degree of the univariate B-spline density basis, that is built upon 2^d+1 equidistant knots. 'd' refers to the hierachy level of the marginal hierarchical B-spline and 'D' to the maximum hierarchical level of the hierarchical B-spline basis. We penalize the m-order differences of the coefficients 'b' to estimate new weights 'b'.

Usage

1
2
3
pencopula(data, d=3, D=d, q=1, base = "B-spline", max.iter = 20,
plot.bsp = FALSE, lambda = NULL, pen.order=2, adapt.grid = FALSE, add =
TRUE, alpha = 0, symmetric= TRUE, data.frame=parent.frame())

Arguments

data

'data' contains the data. 'data' has to be a matrix or a data.frame. The number of columns of 'data' is p.

d

refers to the hierachy level of the marginal hierarchical B-spline, default is d=3.

D

referes to the maximum hierachy level, default is D=3. If D<d, it follows D<-d.

q

degree of the marginal hierarchical B-spline.

base

By default, the used marginal basis is a 'B-spline'. Second possible option is 'Bernstein', using a Bernstein polynomial basis.

max.iter

maximum number of iteration, the default is max.iter=20.

plot.bsp

TRUE or FALSE, if TRUE the used B-Spline base should be plotted.

lambda

p-dimensional vector of penalty parameters, the values can be different. Default is rep(10000,p).

pen.order

The order of differences for the penalization, default is pen.order=2.

adapt.grid

Default = FALSE, if TRUE the used grid for the side condition c(u,b)>=0 is reduced.

add

Default = TRUE. Due to numerical rounding errors, some results in the quadratic programming are misleading. So we add a small epsilon in the side conditions c(u,b) >=0 to the actual weights 'b' in each iteration.

alpha

Default = 0, that results in equidistant knots. If alpha !=0 the knots are moved.

symmetric

Default = TRUE. Tuning parameter for the location of the knots if alpha != 0.

data.frame

reference to the data. Default reference is the parent.frame().

Details

The estimation of the copula density is done in several steps. 1. Preparations: Calculating the number of marginal knots 'ddb' and the number of spline coefficients 'DD' , depending on 'd' and 'D'.'Index.Basis.D' is created, that is an index set, which marginal basis will belong to the hierarchical B-spline density basis 'tilde.Psi.d.D'. 2. Creating marginal spline coefficients 'T.marg', depending on 'Index.basis.D'. 3. Building the hierarchical B-spline density basis 'tilde.PSI.d.D' 4. Calculating the knots for the estimation, by default the knots are equidistant. Transformations of the knots are possible, see 'knots.start'. Moreover a B-spline density basis 'tilde.Psi.d.knots' is created, with the knots in the hierachical order 'knots.t'. 'tilde.Psi.d.knots' is needed for the restriction, that each marginal density has to be uniform. 5. Depending on the knots, start values for the weights 'b' are calculated. This is done in 'startval.grid'. Moreover 'startval.grid' determines an grid for the side condition c(u,b)>=0 of the quadratic program. 6. Each marginal basis is estimated to be uniform. Therefore a matrix 'A.Restrict' is calculated, such that

\eqn{A.Restrict %*% b = 1}

for all p-covariates. 7. The penalty matrix P is created, see 'penalty.matrix'. 8. The first calculation of coefficients 'b' is done. 9. 'my.loop' iterates the calculation of the optimal weights 'b' until some convergence, see 'my.loop'. The maximal number of iterations are limited, default is max.iter=20. 10. If 'my.loop' terminates, the information criteria are calculated, see 'my.IC'.

Value

Returning an object of class pencopula. The class pencopula consists of the environment 'penden.env', which includes all calculated values of the estimation approach. For a fast overview of the main results, one can use the function 'print.pencopula()'.

Note

If the estimation with the current setup does not work, e.g. the quadratic program can not be solved, the routine returns log.like=0, pen.log.like=0, AIC=0 and BIC=0. Please restart the estimation, e.g. with a different penalty parameter.

Author(s)

Christian Schellhase <cschellhase@wiwi.uni-bielefeld.de>

References

Flexible Copula Density Estimation with Penalized Hierarchical B-Splines, Kauermann G., Schellhase C. and Ruppert, D. (2013), Scandinavian Journal of Statistics 40(4), 685-705.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
require(MASS)
data(Lufthansa)
data(DeutscheBank)

data.dbank <- data.lufth <- c()

dim.data <- dim(DeutscheBank)[1]

for(i in 2:dim.data) data.dbank[i-1] <- log(DeutscheBank[i,2]/DeutscheBank[i-1,2])
for(i in 2:dim.data) data.lufth[i-1] <- log(Lufthansa[i,2]/Lufthansa[i-1,2])

dbank1 <- fitdistr(data.dbank,"t")
lufth1 <- fitdistr(data.lufth,"t")

mypt <- function(x, m, s, df) pt((x-m)/s, df)

Y <- cbind(mypt(data.dbank,dbank1$estimate[1],s=dbank1$estimate[2],df=dbank1$estimate[3]),
mypt(data.lufth,lufth1$estimate[1],s=lufth1$estimate[2],df=lufth1$estimate[3]))

cop <- pencopula(Y,d=4,D=4,lambda=rep(10,2))