QuantifQuantile for general X

Description

Estimation of conditional quantiles using optimal quantization when X is d-dimensional.

Usage

QuantifQuantile.d(X, Y, x, alpha = c(0.05, 0.25, 0.5, 0.75, 0.95),
  testN = c(35, 40, 45, 50, 55), p = 2, B = 50, tildeB = 20,
  same_N = TRUE, ncores = 1)

Arguments

X

matrix of covariates, with d rows (one column per observation).

Y

vector of response variables.

x

matrix of values for x in q_alpha(x), with d rows (one column per point).

alpha

vector of orders of the quantiles.

testN

grid of values of N that will be tested.

p

order p of the L_p norm used for optimal quantization.

B

number of bootstrap replications for the bootstrap estimator.

tildeB

number of bootstrap replications for the choice of N.

same_N

whether to use the same value of N for each alpha (TRUE by default).

ncores

number of cores to use. Default is set to 1 (see Details below).
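
As a rough illustration of the input layout these arguments expect (X and x both have d rows, so each column is one point), the following base-R sketch builds inputs for d = 3. The sizes n, d and m and the simulated model are illustrative assumptions, not part of the package.

```r
# Sketch of the expected input layout, assuming d = 3 covariates.
# n, d, m and the simulated model are illustrative choices only.
set.seed(1)
n <- 200   # number of observations
d <- 3     # covariate dimension
m <- 10    # number of points at which quantiles are wanted

X <- matrix(runif(n * d, -2, 2), nrow = d)   # d x n: one column per observation
Y <- colSums(X^2) + rnorm(n)                 # length n: one response per column of X
x <- matrix(runif(m * d, -2, 2), nrow = d)   # d x m: one column per evaluation point

# Basic consistency checks before calling QuantifQuantile.d(X, Y, x, ...)
stopifnot(nrow(X) == nrow(x), ncol(X) == length(Y))
```

Running the checks up front catches a transposed matrix (n rows instead of d rows) before any bootstrap computation starts.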

Details

  • This function calculates estimated conditional quantiles with a method based on optimal quantization, for a covariate of any dimension d. The covariate matrix X must have d rows (one per dimension). For the particular cases d = 1 or d = 2, it is strongly recommended to use QuantifQuantile and QuantifQuantile.d2 respectively, which are computationally faster. The argument x must also have d rows.

  • The criterion for selecting the number of quantizers N is implemented in this function. The user chooses a grid testN of candidate values, and N is selected from that grid by minimizing a bootstrap estimate of the ISE (Integrated Squared Error). More precisely, for each fixed N, the function sums hatISE_N over alpha and then minimizes the resulting vector over testN to obtain N_opt. Alternatively, the user can select a different value of N_opt for each alpha by setting same_N=FALSE; the vector N_opt is then obtained by minimizing hatISE_N separately for each alpha. The default is same_N=TRUE because choosing N_opt separately for each alpha can produce crossing conditional quantile curves (although this is rarely observed when the values of alpha are not too close). The function plot.QuantifQuantile illustrates the selection of N_opt. If the plotted curve is not first decreasing and then increasing, the grid testN should be adapted.

  • This function can use parallel computation to save time: simply increase the parameter ncores. Parallel computation relies on mclapply from the parallel package and is therefore not available on Windows, where ncores must stay at 1 (the default value).
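
The selection rule for N described above can be sketched in base R. The matrix below is made-up data standing in for hatISE_N (rows indexed by alpha, columns by testN, as described under Value); it is not output from the package, and the package's internal code may differ.

```r
# Hypothetical hatISE_N: rows = alpha, columns = testN (values invented for illustration)
alpha <- c(0.05, 0.5, 0.95)
testN <- c(35, 40, 45, 50, 55)
hatISE_N <- rbind(c(0.90, 0.70, 0.60, 0.65, 0.80),
                  c(0.50, 0.40, 0.45, 0.55, 0.60),
                  c(0.95, 0.85, 0.70, 0.60, 0.75))
dimnames(hatISE_N) <- list(alpha = alpha, N = testN)

# same_N = TRUE: aggregate over alpha, then pick the single minimizing N
N_opt_same <- testN[which.min(colSums(hatISE_N))]

# same_N = FALSE: minimize separately for each alpha
N_opt_each <- testN[apply(hatISE_N, 1, which.min)]

N_opt_same   # 45
N_opt_each   # 45 40 50
```

With these invented numbers the per-alpha choice differs across alpha, which is the situation in which crossing quantile curves can occur. Summing or averaging over alpha yields the same minimizing N, since the two differ only by the constant factor length(alpha).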

Value

An object of class QuantifQuantile which is a list with the following components:

hatq_opt

A matrix containing the estimated conditional quantiles. The number of columns is the number of values considered for x, and the number of rows is the length of the vector of orders alpha. This object can also be returned using the usual fitted.values function.

N_opt

Optimal selected value for N. An integer if same_N=TRUE and a vector of integers of length length(alpha) otherwise.

hatISE_N

The matrix of estimated ISE provided by the selection criterion for N, before taking the mean over alpha. It has length(testN) columns and length(alpha) rows.

hatq_N

A 3-dimensional array containing the estimated conditional quantiles for each considered value for alpha, x and N.

X

The matrix of covariates.

Y

The vector of response variables.

x

The considered matrix of values for x in q_alpha(x).

alpha

The considered vector of orders of the quantiles.

testN

The considered grid of values for N that were tested.
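
To make the layout of hatq_opt concrete, here is a minimal sketch using a mock list filled with invented values; only the shape (rows = alpha, columns = points in x) follows the description above. The names res_mock, median_curve and no_crossing are hypothetical helpers, not part of the package.

```r
# Mock stand-in for a fitted object, filled with invented values; only the
# shape follows the documentation (rows = alpha, columns = points in x).
alpha <- c(0.05, 0.5, 0.95)
hatq_opt <- rbind(c(-1.6, -1.2, -0.9, -1.1),
                  c( 0.1,  0.4,  0.8,  0.5),
                  c( 1.7,  2.1,  2.4,  2.0))
res_mock <- list(hatq_opt = hatq_opt, alpha = alpha)

# Median curve: the row of hatq_opt matching alpha = 0.5
median_curve <- res_mock$hatq_opt[which(res_mock$alpha == 0.5), ]

# Crossing check: within each column, estimates should not decrease as alpha grows
no_crossing <- all(apply(res_mock$hatq_opt, 2, diff) >= 0)
```

A check like no_crossing may be useful after fitting with same_N=FALSE, where crossing curves are possible.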

References

Charlier, I., Paindaveine, D. and Saracco, J. (2015). Conditional quantile estimation through optimal quantization. Journal of Statistical Planning and Inference, 156, 14-30.

Charlier, I., Paindaveine, D. and Saracco, J. Conditional quantile estimator based on optimal quantization: from theory to practice. Submitted.

See Also

QuantifQuantile and QuantifQuantile.d2 for the particular cases of dimension one and two.

plot.QuantifQuantile, print.QuantifQuantile, summary.QuantifQuantile

Examples

## Not run: 
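library(QuantifQuantile)
set.seed(644925)
n <- 500
# One-dimensional covariate and a quadratic regression with Gaussian noise
X <- runif(n, -2, 2)
Y <- X^2 + rnorm(n)
# Grid of 100 points at which the conditional quantiles are estimated
x <- seq(min(X), max(X), length.out = 100)
res <- QuantifQuantile.d(X, Y, x, testN = seq(15, 35, by = 5))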

## End(Not run)
## Not run: 
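library(QuantifQuantile)
set.seed(272422)
n <- 1000
# Two-dimensional covariate stored as a 2 x n matrix (one column per observation)
X <- matrix(runif(n * 2, -2, 2), ncol = n)
Y <- colSums(X^2) + rnorm(n)
# 20 x 20 grid of evaluation points, stored as a 2 x 400 matrix
x1 <- seq(min(X[1, ]), max(X[1, ]), length.out = 20)
x2 <- seq(min(X[2, ]), max(X[2, ]), length.out = 20)
x <- matrix(c(rep(x1, 20), rep(x2, each = 20)), nrow = nrow(X), byrow = TRUE)
res <- QuantifQuantile.d(X, Y, x, testN = seq(90, 140, by = 10), B = 20, tildeB = 15)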

## End(Not run)
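
Because parallel computation relies on mclapply and is unavailable on Windows, a script meant to run on several platforms can choose ncores defensively. The following is a minimal sketch; the rule of leaving one core free is an arbitrary assumption, not a package recommendation.

```r
# Pick ncores defensively: 1 on Windows (mclapply cannot fork there),
# otherwise leave one core free. The "minus one" rule is an arbitrary choice.
library(parallel)

available <- detectCores()            # may be NA on some platforms
if (is.na(available)) available <- 1L

ncores <- if (.Platform$OS.type == "windows") 1L else max(1L, available - 1L)

# The value would then be passed on, e.g.
# res <- QuantifQuantile.d(X, Y, x, ncores = ncores)
ncores
```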