# fitCopula: Fitting Copulas to Data - Copula Parameter Estimation In copula: Multivariate Dependence with Copulas

 fitCopula R Documentation

## Fitting Copulas to Data – Copula Parameter Estimation

### Description

Parameter estimation of copulas, i.e., fitting of a copula model to multivariate (possibly “pseudo”) observations.

### Usage

```loglikCopula(param = getTheta(copula), u, copula,
error = c("-Inf", "warn-Inf", "let-it-be"))

loglikCopulaMany(pList, u, copula)

## Generic [and "rotCopula" method] : %- ../R/fitCopula.R
fitCopula(copula, data, ...)
## S4 method for signature 'parCopula'
fitCopula(copula, data,
method = c("mpl", "ml", "itau", "irho", "itau.mpl"),
posDef = is(copula, "ellipCopula"),
start = NULL, lower = NULL, upper = NULL,
optim.method = optimMeth(copula, method, dim = d),
optim.control = list(maxit=1000),
estimate.variance = NA, hideWarnings = FALSE, ...)

optimMeth(copula, method, dim)
```

### Arguments

 `param` vector of free (see `isFree()` and `getTheta()`) parameter values. `pList` a `list` of free parameter vectors (as `param` above). In the 1D case, `length(param) == 1`, may also be a numeric vector. `u` n x d-matrix of (pseudo-)observations in [0,1]^d for computing the copula log-likelihood, where n denotes the sample size and d the dimension. Consider applying the function `pobs()` first in order to obtain such data. `data` as u, an n x d-matrix of data. For `method` being `"mpl"`, `"ml"` or `"itau.mpl"`, this has to be data in [0,1]^d. For `method` being `"itau"` or `"irho"`, it can either be data in [0,1]^d or in the whole d-dimensional space. `copula` a `"copula"` object. `error` (for `loglikCopula()`:) a `character` string specifying how errors in the underlying `dCopula()` calls should be handled: `"-Inf"`:the value of the log likelihood should silently be set to `-Inf`. `"warn-Inf"`:signal a `warning` about the error and set the value to `-Inf`. `"let-it-be"`:the error is signalled and hence the likelihood computation fails. `method` a `character` string specifying the copula parameter estimator used. This can be one of: "mpl"Maximum pseudo-likelihood estimator (based on “pseudo-observations” in [0,1]^d, typical obtained via `pobs()`). "ml"As `"mpl"` just with a different variance estimator. For this to be correct (thus giving the true MLE), `data` are assumed to be observations from the true underlying copula whose parameter is to be estimated. "itau"Inversion of Kendall's tau estimator. `data` can be either in [0,1]^d (true or pseudo-observations of the underlying copula to be estimated) or in the d-dimensional space. "irho"As `"itau"` just with Spearman's rho instead of Kendall's tau. "itau.mpl"This is the estimator of t copula parameters suggested by Mashal and Zeevi (2002) based on the idea of inverting Kendall's tau for estimating the correlation matrix as introduced in a RiskLab report in 2001 later published as Embrechts et al. (2003); see also Demarta and McNeil (2005). The given `data` has to be in [0,1]^d (either true or pseudo-observations of the underlying copula to be estimated). Note that this method requires `dispstr = "un"`. `posDef` a `logical` indicating whether a proper correlation matrix is computed. `start` a `vector` of starting values for the parameter optimization via `optim()`. `lower, upper` Lower or upper parameter bounds for the optimization methods `"Brent"` or `"L-BFGS-B"`. `optim.control` a `list` of control parameters passed to `optim(*, control=optim.control)`. `optim.method` a character string specify the optimization method or a `function` which when called with arguments `(copula, method, dim)` will return such a character string, see `optim()`'s `method`; only used when `method = "mpl"` or `"ml"`. The default has been changed (for copula 0.999-16, in Aug. 2016) from `"BFGS"` to the result of `optimMeth(copula, method, dim)` which is often `"L-BFGS-B"`. `dim` integer, the data and copula dimension, d >= 2. `estimate.variance` a `logical` indicating whether the estimator's asymptotic variance is computed (if available for the given `copula`; the default `NA` computes it for the `method`s `"itau"` and `"irho"`, cannot (yet) compute it for `"itau.mpl"` and only computes it for `"mpl"` or `"ml"` if the optimization converged). `hideWarnings` a `logical`, which, if `TRUE`, suppresses warnings from the involved likelihood maximization (typically when the likelihood is evaluated at invalid parameter values). `...` additional arguments passed to `method` specific auxiliary functions, e.g., `traceOpt = TRUE` (or `traceOpt = 10`) for tracing `optimize` (every 10-th function evaluation) for method `"itau.mpl"`, and for “manual” tracing with method `"ml"` or `"mpl"` also showing parameter values (notably for `optim.method="Brent"`), see the extra arguments of namespace-hidden function `fitCopula.ml()`.

### Details

The only difference between `"mpl"` and `"ml"` is in the variance-covariance estimate, not in the parameter (θ) estimates.

If method `"mpl"` in `fitCopula()` is used and if `start` is not assigned a value, estimates obtained from method `"itau"` are used as initial values in the optimization. Standard errors are computed as explained in Genest, Ghoudi and Rivest (1995); see also Kojadinovic and Yan (2010, Section 3). Their estimation requires the computation of certain partial derivatives of the (log) density. These have been implemented for six copula families thus far: the Clayton, Gumbel-Hougaard, Frank, Plackett, normal and t copula families. For other families, numerical differentiation based on `grad()` from package numDeriv is used (and a warning message is displayed).

In the multiparameter elliptical case and when the estimation is based on Kendall's tau or Spearman's rho, the estimated correlation matrix may not always be positive-definite. In that case, `nearPD(*, corr=TRUE)` (from Matrix) is applied to get a proper correlation matrix.

For normal and t copulas, `fitCopula(, method = "mpl")` and `fitCopula(, method = "ml")` maximize the log-likelihood based on mvtnorm's `dmvnorm()` and `dmvt()`, respectively. The latter two functions set the respective densities to zero if the correlation matrices of the corresponding distributions are not positive definite. As such, the estimated correlation matrices will be positive definite.

If methods `"itau"` or `"irho"` are used in `fitCopula()`, an estimate of the asymptotic variance (if available for the copula under consideration) will be correctly computed only if the argument `data` consists of pseudo-observations (see `pobs()`).

Consider the t copula with `df.fixed=FALSE` (see `ellipCopula()`). In this case, the methods `"itau"` and `"irho"` cannot be used in `fitCopula()` as they cannot estimate the degrees of freedom parameter `df`. For the methods `"mpl"` and `"itau.mpl"` the asymptotic variance cannot be (fully) estimated (yet). For the methods `"ml"` and `"mpl"`, when `start` is not specified, the starting value for `df` is set to `copula@df`, typically 4.

To implement the Inference Functions for Margins (IFM) method (see, e.g., Joe 2005), set `method="ml"` and note that `data` need to be parametric pseudo-observations obtained from fitted parametric marginal distribution functions. The returned large-sample variance will then underestimate the true variance (as the procedure cannot take into account the (unknown) estimation error for the margins).

The fitting procedures based on `optim()` generate warnings because invalid parameter values are tried during the optimization process. When the number of parameters is one and the parameter space is bounded, using `optim.method="Brent"` is likely to give less warnings. Furthermore, from experience, `optim.method="Nelder-Mead"` is sometimes a more robust alternative to `optim.method="BFGS"` or `"L-BFGS-B"`.

There are methods for `vcov()`, `coef()`, `logLik()`, and `nobs()`.

### Value

`loglikCopula()` returns the copula log-likelihood evaluated at the parameter (vector) `param` given the data `u`.

`loglikCopulaMany()` returns a numeric vector of such log-likelihoods; it assumes consistent parameter values, corresponding to `loglikCopula()`'s `error = "let-it-be"`, for speed.

The return value of `fitCopula()` is an object of class `"fitCopula"` (inheriting from hidden class `"fittedMV"`), containing (among others!) the slots

estimate

The parameter estimates.

var.est

The large-sample (i.e., asymptotic) variance estimate of the parameter estimator unless `estimate.variance=FALSE` where it is `matrix(numeric(), 0,0)` (to be distinguishable from cases when the covariance estimates failed partially).

copula

The fitted copula object.

The `summary()` method for `"fitCopula"` objects returns an S3 “class” `"summary.fitCopula"`, which is simply a list with components `method`, `loglik` and `convergence`, all three from the corresponding slots of the `"fitCopula"` objects, and `coefficients` (a matrix of estimated coefficients, standard errors, t values and p-values).

### References

Genest, C. (1987). Frank's family of bivariate distributions. Biometrika 74, 549–555.

Genest, C. and Rivest, L.-P. (1993). Statistical inference procedures for bivariate Archimedean copulas. Journal of the American Statistical Association 88, 1034–1043.

Rousseeuw, P. and Molenberghs, G. (1993). Transformation of nonpositive semidefinite correlation matrices. Communications in Statistics: Theory and Methods 22, 965–984.

Genest, C., Ghoudi, K., and Rivest, L.-P. (1995). A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 82, 543–552.

Joe, H. (2005). Asymptotic efficiency of the two-stage estimation method for copula-based models. Journal of Multivariate Analysis 94, 401–419.

Mashal, R. and Zeevi, A. (2002). Beyond Correlation: Extreme Co-movements Between Financial Assets. https://www0.gsb.columbia.edu/faculty/azeevi/PAPERS/BeyondCorrelation.pdf (2016-04-05)

Demarta, S. and McNeil, A. J. (2005). The t copula and related copulas. International Statistical Review 73, 111–129.

Genest, C. and Favre, A.-C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering 12, 347–368.

Kojadinovic, I. and Yan, J. (2010). Comparison of three semiparametric methods for estimating dependence parameters in copula models. Insurance: Mathematics and Economics 47, 52–63.

`Copula`, `fitMvdc` for fitting multivariate distributions including the margins, `gofCopula` for goodness-of-fit tests.

For maximum likelihood of (nested) Archimedean copulas, see `emle`, etc.

### Examples

```(Xtras <- copula:::doExtras()) # determine whether examples will be extra (long)
n <- if(Xtras) 200 else 64 # sample size

## A Gumbel copula
set.seed(7) # for reproducibility
gumbel.cop <- gumbelCopula(3, dim=2)
x <- rCopula(n, gumbel.cop) # "true" observations (simulated)
u <- pobs(x)                # pseudo-observations
## Inverting Kendall's tau
fit.tau <- fitCopula(gumbelCopula(), u, method="itau")
fit.tau
confint(fit.tau) # work fine !
confint(fit.tau, level = 0.98)
summary(fit.tau) # a bit more, notably "Std. Error"s
coef(fit.tau)# named vector
coef(fit.tau, SE = TRUE)# matrix

## Inverting Spearman's rho
fit.rho <- fitCopula(gumbelCopula(), u, method="irho")
summary(fit.rho)
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(gumbelCopula(), u, method="mpl")
fit.mpl
## Maximum likelihood -- use 'x', not 'u' ! --
fit.ml <- fitCopula(gumbelCopula(), x, method="ml")
summary(fit.ml) # now prints a bit more than simple 'fit.ml'
## ... and what's the log likelihood (in two different ways):
(ll. <- logLik(fit.ml))
stopifnot(all.equal(as.numeric(ll.),
loglikCopula(coef(fit.ml), u=x, copula=gumbel.cop)))

## A Gauss/normal copula

## With multiple/*un*constrained parameters
set.seed(6) # for reproducibility
normal.cop <- normalCopula(c(0.6, 0.36, 0.6), dim=3, dispstr="un")
x <- rCopula(n, normal.cop) # "true" observations (simulated)
u <- pobs(x)                # pseudo-observations
## Inverting Kendall's tau
fit.tau <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="itau")
fit.tau
## Inverting Spearman's rho
fit.rho <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="irho")
fit.rho
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="mpl")
summary(fit.mpl)
coef(fit.mpl) # named vector
coef(fit.mpl, SE = TRUE) # the matrix, with SE
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="un"), x, method="ml", traceOpt=TRUE)
summary(fit.ml)
confint(fit.ml)
confint(fit.ml, level = 0.999) # clearly non-0

## Fix some of the parameters
param <- c(.6, .3, NA_real_)
attr(param, "fixed") <- c(TRUE, FALSE, FALSE)
ncp <- normalCopula(param = param, dim = 3, dispstr = "un")
fixedParam(ncp) <- c(TRUE, TRUE, FALSE)
## 'traceOpt = 5': showing every 5-th log likelihood evaluation:
summary(Fxf.mpl <- fitCopula(ncp, u, method = "mpl", traceOpt = 5))
Fxf.mpl@copula # reminding of the fixed param. values

## With dispstr = "toep" :
normal.cop.toep <- normalCopula(c(0, 0), dim=3, dispstr="toep")
## Inverting Kendall's tau
fit.tau <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="itau")
fit.tau
## Inverting Spearman's rho
fit.rho <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="irho")
summary(fit.rho)
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="mpl")
fit.mpl
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="toep"), x, method="ml")
summary(fit.ml)

## With dispstr = "ar1"
normal.cop.ar1 <- normalCopula(c(0), dim=3, dispstr="ar1")
## Inverting Kendall's tau
summary(fit.tau <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="itau"))
## Inverting Spearman's rho
summary(fit.rho <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="irho"))
## Maximum pseudo-likelihood
summary(fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="mpl"))
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="ar1"), x, method="ml")
summary(fit.ml)

## A t copula with variable df (df.fixed=FALSE)
(tCop <- tCopula(c(0.2,0.4,0.6), dim=3, dispstr="un", df=5))
set.seed(101)
x <- rCopula(n, tCop) # "true" observations (simulated)
## Maximum likelihood (start = (rho[1:3], df))
summary(tc.ml <- fitCopula(tCopula(dim=3, dispstr="un"), x, method="ml",
start = c(0,0,0, 10)))
## Maximum pseudo-likelihood (the asymptotic variance cannot be estimated)
u <- pobs(x)          # pseudo-observations
tc.mpl <- fitCopula(tCopula(dim=3, dispstr="un"),
u, method="mpl", estimate.variance=FALSE,
start= c(0,0,0, 10))
summary(tc.mpl)
```

copula documentation built on June 15, 2022, 5:07 p.m.