Gof test using the Anderson-Darling test statistic and the gamma distribution

Description

gofRosenblattGamma performs the RosenblattGamma goodness-of-fit (gof) test for copulae described in Genest et al. (2009) and Hofert et al. (2014). It compares the empirical copula against a parametric estimate of the copula derived under the null hypothesis. The margins can be estimated with a range of distributions, and an estimate of the required computation time can be reported. The approximate p-values are computed with a parametric bootstrap, which can be accelerated by enabling the built-in parallel computation. The gof statistics are computed with the function gofTstat from the package copula. Datasets of any dimension above 1 are accepted, and the possible copulae are "normal", "t", "gumbel", "clayton" and "frank". The parameters are estimated with the pseudo maximum likelihood method. If this estimation fails, inversion of Kendall's tau is used instead.
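The fallback by inversion of Kendall's tau can be illustrated with the well-known closed-form relations between tau and the parameter of some bivariate copulae. The following is only a sketch in base R: the helper tau_to_param is hypothetical and not part of the package, and the Frank copula is left out because its relation has no closed form.

```r
# Closed-form inversions of Kendall's tau for bivariate copulae
# (tau_to_param is a hypothetical helper, not a package function).
tau_to_param <- function(tau, copula = c("normal", "t", "clayton", "gumbel")) {
  copula <- match.arg(copula)
  switch(copula,
         normal  = sin(pi * tau / 2),    # correlation of the normal copula
         t       = sin(pi * tau / 2),    # same relation for the t copula
         clayton = 2 * tau / (1 - tau),  # Clayton parameter theta
         gumbel  = 1 / (1 - tau))        # Gumbel parameter theta
}

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
x[, 2] <- x[, 1] + 0.5 * x[, 2]          # induce positive dependence
tau <- cor(x[, 1], x[, 2], method = "kendall")
theta_clayton <- tau_to_param(tau, "clayton")
```

The package itself may use a different routine for this step; the sketch only shows the idea of mapping an empirical tau to a copula parameter.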

Usage

gofRosenblattGamma(copula, x, M = 1000, param = 0.5, param.est = T, df = 4, 
                    df.est = T, margins = "ranks", dispstr = "ex", 
                    execute.times.comp = T, processes = 1)

Arguments

copula

The copula to test for. Possible values are "normal", "t", "clayton" and "gumbel".

x

A matrix containing the residuals of the data.

M

Number of bootstrapping loops.

param

The copula parameter to use, if it shall not be estimated.

param.est

Shall be either TRUE or FALSE. TRUE means that param will be estimated.

df

Degrees of freedom, if not meant to be estimated. Only necessary if tested for "t"-copula.

df.est

Indicates if df shall be estimated. Has to be either FALSE or TRUE, where TRUE means that it will be estimated.

margins

Specifies which estimation method is used for the margins when the input data are not in the range [0,1]. The default is "ranks", which is the standard approach for converting data in such a case. Alternatively, one of the following distributions can be specified: "beta", "cauchy", Chi-squared ("chisq"), "f", "gamma", Log normal ("lnorm"), Normal ("norm"), "t", "weibull", Exponential ("exp").

dispstr

A character string specifying the type of the symmetric positive definite matrix characterizing the elliptical copula. Implemented structures are "ex" for exchangeable and "un" for unstructured, see package copula.

execute.times.comp

Logical. Defines whether the time that the estimation will most likely take is computed. It is only reported if M is at least 100.

processes

The number of parallel processes which are performed to speed up the bootstrapping. Shouldn't be higher than the number of logical processors. Please see the details.
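For the margins argument, the "ranks" option corresponds to the usual construction of pseudo-observations from scaled ranks. A minimal sketch in base R, assuming the common scaling by n + 1 (the toy data and variable names are illustrative only, and the package may compute this internally in a different way):

```r
set.seed(42)
x <- cbind(rnorm(50, sd = 3), rexp(50))  # toy data outside [0, 1]

# Column-wise ranks scaled by (n + 1), so all values lie strictly in (0, 1)
n <- nrow(x)
u <- apply(x, 2, rank) / (n + 1)

range(u)
```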

Details

As described in Hofert et al. (2014), this test computes the Anderson-Darling test statistic for random variates that are (supposedly) U[0,1]-distributed under H_0, via the distribution function of the gamma distribution. The null hypothesis H_0 is

C \in \mathcal{C}_0

with \mathcal{C}_0 as the true class of copulae under H_0.

This test is based on the Rosenblatt probability integral transform, which uses the mapping R : (0,1)^d -> (0,1)^d. Following Genest et al. (2009), this transformation decomposes a random vector u in [0,1]^d with distribution C into mutually independent components, each uniformly distributed on the unit interval. The mapping provides pseudo-observations E[i], given by

E_1 = R(U_1), ..., E_n = R(U_n).

The mapping assigns to every vector u the vector e with e[1] = u[1] and, for i in {2, ..., d},

e[i] = [d^(i-1) C(u[1], ..., u[i], 1, ..., 1) / (d u[1] ... d u[i-1])] / [d^(i-1) C(u[1], ..., u[i-1], 1, ..., 1) / (d u[1] ... d u[i-1])].
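As a worked illustration of this mapping, consider the bivariate Clayton copula, for which the partial derivative is available in closed form. The sketch below uses base R only. The parameter value, sample size and variable names are assumptions chosen for illustration. It samples from the copula by conditional inversion and then checks that the Rosenblatt transform recovers the independent uniforms used for sampling.

```r
# Bivariate Clayton copula: C(u1, u2) = (u1^-theta + u2^-theta - 1)^(-1/theta)
theta <- 2
set.seed(123)
n  <- 500
u1 <- runif(n)
w  <- runif(n)  # independent uniforms used to generate u2

# Conditional inversion: solve dC/du1 (u1, u2) = w for u2
u2 <- ((w^(-theta / (1 + theta)) - 1) * u1^(-theta) + 1)^(-1 / theta)

# Rosenblatt transform: e1 = u1, e2 = dC/du1 (u1, u2) / dC/du1 (u1, 1);
# the denominator equals 1 because C(u1, 1) = u1
e1 <- u1
e2 <- u1^(-theta - 1) * (u1^(-theta) + u2^(-theta) - 1)^(-(1 + theta) / theta)

all.equal(e2, w)  # compares the transformed values with the original uniforms
```

Under the correct copula, e1 and e2 are independent and uniform; under a misspecified copula they generally are not, which is what the test exploits.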

The Anderson-Darling test statistic of the variates

G(x[j]) = pgamma(x[j], shape=d)

is computed (via ADGofTest::ad.test), where x[j] = -log(e_{1j})-...-log(e_{dj}) and pgamma( . ,shape=d) denotes the distribution function of the gamma distribution with shape parameter d and rate parameter one (equal to an Erlang(d) distribution function).

With the values G(x[j]) sorted in increasing order, the test statistic is then given by

T = -n - sum((2j - 1)/n [ln(G(x[j])) + ln(1 - G(x[n+1-j]))], j = 1, ..., n).
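The computation of T can be sketched in a few lines of base R. Here the matrix E is only a stand-in for the pseudo-observations (independent uniforms, as expected under H_0), and the names E, G and T_ad are illustrative. The package itself delegates this step to ADGofTest::ad.test.

```r
set.seed(7)
n <- 200
d <- 3
E <- matrix(runif(n * d), ncol = d)  # stand-in for the pseudo-observations

x <- -rowSums(log(E))                # x[j] = -log(e_{1j}) - ... - log(e_{dj})
G <- sort(pgamma(x, shape = d))      # gamma cdf with shape d and rate 1, sorted ascending

# Anderson-Darling statistic; rev(G)[j] equals G[n + 1 - j]
j <- seq_len(n)
T_ad <- -n - sum((2 * j - 1) / n * (log(G) + log(1 - rev(G))))
T_ad
```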

The approximate p-value is computed by the formula,

sum(|T[b]| >= |T|, b=1, .., M) / M,

where T and T[b] denote the test statistic and the bootstrapped test statistic, respectively.
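The p-value formula amounts to the share of bootstrapped statistics that are at least as extreme as the observed one. A minimal sketch with made-up numbers: T_obs and T_boot are hypothetical placeholders, and the chi-squared draws merely stand in for bootstrap output rather than representing the actual null distribution.

```r
T_obs <- 1.8                                # hypothetical observed statistic
set.seed(99)
T_boot <- rchisq(1000, df = 1)              # hypothetical bootstrapped statistics, M = 1000

p_value <- mean(abs(T_boot) >= abs(T_obs))  # sum(|T[b]| >= |T|) / M
p_value
```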

For small values of M, initializing the parallelization via processes does not pay off, because registering the parallel processes increases the computation time. Consider enabling parallelization only for high values of M.
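One possible way to choose a value for processes is sketched below. The threshold of 1000 bootstrap loops and the choice to leave one processor free are assumptions for illustration, not recommendations from the package authors.

```r
M <- 1000
n_logical <- parallel::detectCores(logical = TRUE)  # may be NA on some platforms
if (is.na(n_logical)) n_logical <- 1L

# Assumed rule of thumb: go parallel only for large M and leave one processor free
processes <- if (M >= 1000) max(1L, n_logical - 1L) else 1L
processes
```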

Value

An object of the class gofCOP with the components

method

a character which informs about the performed analysis

erg.tests

a matrix with the p-value and test statistic of the test

References

Christian Genest, Bruno Remillard, David Beaudoin (2009). Goodness-of-fit tests for copulas: A review and a power study. Insurance: Mathematics and Economics, Volume 44, Issue 2, April 2009, Pages 199-213, ISSN 0167-6687. http://dx.doi.org/10.1016/j.insmatheco.2007.10.005

Marius Hofert, Ivan Kojadinovic, Martin Maechler, Jun Yan (2014). copula: Multivariate Dependence with Copulas. R package version 0.999-15. https://cran.r-project.org/package=copula

Examples

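library(gofCopula)  # provides gofRosenblattGamma and the IndexReturns data
data(IndexReturns)

# Test the normal copula on the first 100 rows and the first two columns;
# M = 20 keeps the bootstrap short for demonstration purposes
gofRosenblattGamma("normal", IndexReturns[1:100, 1:2], M = 20)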