2 and 3 dimensional gof test based on the in-and-out-of-sample approach

Description

gofPIOSTn tests a 2 or 3 dimensional dataset for a copula with the PIOS test. The possible copulae are "normal", "t", "gumbel", "clayton" and "frank". The parameters are estimated with the pseudo maximum likelihood method; if the estimation fails, inversion of Kendall's tau is used instead. The approximate p-values are computed with a semiparametric bootstrap, which can be accelerated by enabling the built-in parallel computation.
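The fallback by inversion of Kendall's tau can be illustrated with the standard closed-form relations between tau and the copula parameter in the bivariate case. This is a minimal sketch, not the package's internal code: the helper name tau_to_param is hypothetical, and the Frank copula is left out because its inversion has no closed form and needs numerical root finding.

```r
# Hypothetical helper: map Kendall's tau to a copula parameter using the
# standard bivariate relations (assumption: not the package's own code).
tau_to_param <- function(tau, copula) {
  switch(copula,
    normal  = sin(pi * tau / 2),   # correlation of the normal copula
    t       = sin(pi * tau / 2),   # same relation for the t copula
    clayton = 2 * tau / (1 - tau),
    gumbel  = 1 / (1 - tau),
    stop("no closed-form inversion in this sketch: ", copula)
  )
}

# Illustrative data: two positively dependent columns.
set.seed(42)
z <- rnorm(200)
x <- cbind(z, z + rnorm(200))

# Empirical Kendall's tau, then the implied Clayton parameter.
tau_hat <- cor(x[, 1], x[, 2], method = "kendall")
theta_clayton <- tau_to_param(tau_hat, "clayton")
```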

Usage

gofPIOSTn(copula, x, M = 1000, param = 0.5, param.est = T, df = 4, df.est = T, 
          margins = "ranks", dispstr = "ex", m = 1, 
          execute.times.comp = T, processes = 1)

Arguments

copula

The copula to test for. Possible values are "normal", "t", "clayton", "gumbel" and "frank".

x

A matrix with 2 or 3 columns containing the residuals of the data.

M

Number of bootstrapping loops.

param

The parameter to be used.

param.est

Either TRUE or FALSE. TRUE means that param will be estimated by maximum likelihood.

df

Degrees of freedom, used if they are not estimated. Only necessary when testing for the "t" copula.

df.est

Indicates if df shall be estimated. Has to be either FALSE or TRUE, where TRUE means that it will be estimated.

margins

Specifies which estimation method shall be used if the input data are not in the range [0,1]. The default is "ranks", which is the standard approach to convert data in such a case. Alternatively, one of the following distributions can be specified: "beta", "cauchy", Chi-squared ("chisq"), "f", "gamma", Log normal ("lnorm"), Normal ("norm"), "t", "weibull", Exponential ("exp").

dispstr

A character string specifying the type of the symmetric positive definite matrix characterizing the elliptical copula. Implemented structures are "ex" for exchangeable and "un" for unstructured, see package copula.

m

Length of blocks.

execute.times.comp

Logical. Defines whether the time the estimation will most likely take shall be computed. It is only reported if M is at least 100.

processes

The number of parallel processes used to speed up the bootstrapping. Should not be higher than the number of logical processors. Please see the details.
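The rank-based conversion selected by margins = "ranks" can be sketched in base R as follows. This is an illustration under assumptions: the helper name pseudo_obs is hypothetical, and the scaling by n + 1 is the common convention for pseudo observations, not necessarily the exact scaling used inside the package.

```r
# Hypothetical helper: turn each column into pseudo observations in (0, 1)
# by ranking within the column and dividing by n + 1 (assumed convention).
pseudo_obs <- function(x) {
  x <- as.matrix(x)
  apply(x, 2, rank) / (nrow(x) + 1)
}

# Small made-up data matrix with two columns on arbitrary scales.
x <- cbind(c(3.2, -1, 0.5, 10), c(100, 50, 75, 25))
u <- pseudo_obs(x)
```

Dividing by n + 1 instead of n keeps every value strictly inside (0, 1), so copula densities that are unbounded at the boundary stay finite.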

Details

The "Tn" test is introduced in Zhang et al. (2015). It tests the H0 hypothesis

H0 : C0 in Ccal,

where C0 denotes the true copula and Ccal the parametric copula family under test.

For the test, the data are split into B blocks of length m. The test then compares the pseudo likelihood of the data in each block under the overall parameter estimate with the pseudo likelihood under the estimate obtained by leaving out the data of that block. This procedure shows whether the data in a block influence the parameter estimation significantly. The test statistic is defined as

T = sum(sum(l(U_k^b;theta_n ) - l(U_k^b;theta_n^(-b) ), k=1, ...,m ), b=1, ...,B)

with the pseudo observations U[ij] for i = 1, ...,n; j = 1, ...,d and

theta_n = arg max_theta sum(l(U_i; theta), i=1, ..., n)

and

theta_n^(-b) = arg max_theta sum(sum(l(U_i^(b'); theta), i=1, ..., m), b'=1, ..., B, b' != b), b = 1, ..., B.
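The statistic defined above can be sketched in base R for the bivariate Clayton copula, whose log density has a simple closed form. This is only an illustration under assumptions, not the package's implementation: the names clayton_loglik, fit_theta and pios_tn are hypothetical, the search interval for theta is an arbitrary choice, and rows that do not fill a complete block are dropped.

```r
# Log density of the bivariate Clayton copula for theta > 0.
clayton_loglik <- function(theta, u) {
  log1p(theta) - (theta + 1) * (log(u[, 1]) + log(u[, 2])) -
    (1 / theta + 2) * log(u[, 1]^(-theta) + u[, 2]^(-theta) - 1)
}

# Pseudo maximum likelihood estimate on an assumed search interval.
fit_theta <- function(u) {
  optimize(function(th) sum(clayton_loglik(th, u)),
           interval = c(0.01, 20), maximum = TRUE)$maximum
}

# In-sample minus out-of-sample pseudo log likelihood, summed over blocks.
pios_tn <- function(u, m = 1) {
  B <- nrow(u) %/% m                     # number of complete blocks
  u <- u[seq_len(B * m), , drop = FALSE] # drop rows of an incomplete block
  block <- rep(seq_len(B), each = m)     # block label of each row
  theta_full <- fit_theta(u)             # theta_n from all data
  stat <- 0
  for (b in seq_len(B)) {
    in_b <- block == b
    theta_minus_b <- fit_theta(u[!in_b, , drop = FALSE]) # theta_n^(-b)
    ub <- u[in_b, , drop = FALSE]
    stat <- stat + sum(clayton_loglik(theta_full, ub) -
                       clayton_loglik(theta_minus_b, ub))
  }
  stat
}

# Simulate Clayton data by conditional inversion, then rank-transform.
set.seed(1)
n <- 60
theta0 <- 2
u1 <- runif(n)
w <- runif(n)
u2 <- (u1^(-theta0) * (w^(-theta0 / (1 + theta0)) - 1) + 1)^(-1 / theta0)
u <- apply(cbind(u1, u2), 2, rank) / (n + 1)

tn <- pios_tn(u, m = 5)
```

Each block requires one additional parameter fit, so the cost grows with the number of blocks B; larger m means fewer refits.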

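The approximate p-value is computed by the formula

sum(|T[b]| >= |T|, b=1, ..., M) / M,

where T[b] is the test statistic computed from the b-th bootstrap sample and M is the number of bootstrapping loops.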
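As a small worked example of the p-value formula, the share of bootstrap statistics that are at least as extreme as the observed one can be computed in base R. The function name boot_pvalue and the numbers are made up for illustration.

```r
# Hypothetical helper: proportion of bootstrap statistics whose absolute
# value is at least the absolute value of the observed statistic.
boot_pvalue <- function(t_obs, t_boot) {
  mean(abs(t_boot) >= abs(t_obs))
}

# Made-up values: |-2.0| and |1.5| reach |1.5|, so 2 of 5 qualify.
p <- boot_pvalue(1.5, c(0.2, -2.0, 1.5, 0.9, -1.0))  # 0.4
```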

The applied estimation method is the two-step pseudo maximum likelihood approach, see Genest et al. (1995).

For small values of M, initializing the parallelization via processes does not make sense, because registering the parallel processes increases the computation time. Please consider enabling parallelization only for high values of M.
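The trade-off can be sketched with the parallel package that ships with R. This is an assumption-laden illustration, not the mechanism used inside gofPIOSTn: boot_stat is a hypothetical stand-in for one bootstrap loop, and the cluster setup shown here is the kind of fixed start-up cost that only pays off when M is large.

```r
library(parallel)

# Hypothetical stand-in for one bootstrap loop; seeded per loop so the
# sequential and the parallel run give the same numbers.
boot_stat <- function(i) {
  set.seed(i)
  mean(rnorm(100))
}

M <- 8

# Sequential run: no start-up cost.
seq_res <- vapply(seq_len(M), boot_stat, numeric(1))

# Parallel run: creating and stopping the cluster is pure overhead,
# which dominates when M is small.
cl <- makeCluster(2)
par_res <- unlist(parLapply(cl, seq_len(M), boot_stat))
stopCluster(cl)
```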

Value

An object of the class gofCOP with the components

method

a character string describing the performed analysis

statistic

value of the test statistic

p.value

the approximate p-value

References

Zhang, S., Okhrin, O., Zhou, Q., and Song, P. Goodness-of-fit Test For Specification of Semiparametric Copula Dependence Models. Under revision in Journal of Econometrics from 15.01.2014. http://sfb649.wiwi.hu-berlin.de/papers/pdf/SFB649DP2013-041.pdf

Genest, C., Ghoudi, K., and Rivest, L.-P. (1995). A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika, 82:543-552

Examples

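# Load the example data set
data(IndexReturns)

# Test the first 100 observations of the first two columns for a normal
# copula; M = 20 bootstrap loops keeps the run time of the example short
gofPIOSTn("normal", IndexReturns[1:100, 1:2], M = 20)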