np.copula: Kernel Copula Estimation with Mixed Data Types
In JeffreyRacine/R-Package-np: Nonparametric Kernel Smoothing Methods for Mixed Data Types

npcopula

R Documentation

Kernel Copula Estimation with Mixed Data Types

Description

npcopula implements the nonparametric mixed data kernel copula approach of Racine (2015) for an arbitrary number of dimensions

Usage

npcopula(bws,
         data,
         u = NULL,
         n.quasi.inv = 1000,
         er.quasi.inv = 1)

Arguments

`bws`	an unconditional joint distribution (`npudistbw`) or joint density (`npudensbw`) bandwidth object (if `bws` is delivered via `npudistbw` the copula distribution is estimated, while if `bws` is delivered via `npudensbw` the copula density is estimated)
`data`	a data frame containing variables used to construct `bws`
`u`	an optional matrix of real numbers lying in [0,1], each column of which corresponds to the vector of uth quantile values desired for each variable in the copula (otherwise the u values returned are those corresponding to the sample realizations)
`n.quasi.inv`	number of grid points generated when `u` is provided in order to compute the quasi-inverse of each marginal distribution (see details)
`er.quasi.inv`	number passed to `extendrange` when `u` is provided specifying the fraction by which the data range should be extended when constructing the grid used to compute the quasi-inverse (see details)

Details

npcopula computes the nonparametric copula or copula density using inversion (Nelsen (2006), page 51). For the inversion approach, we exploit Sklar's theorem (Corollary 2.3.7, Nelsen (2006)) to produce copulas directly from the joint distribution function using C(u,v) = H(F^{-1}(u),G^{-1}(v)) rather than the typical approach that instead uses H(x,y) = C(F(x),G(y)). Whereas the latter requires kernel density estimation on a d-dimensional unit hypercube which necessitates the use of boundary correction methods, the former does not.

Note that if u is provided then expand.grid is called on u. As the dimension increases this can become unwieldy and potentially consume an enormous amount of memory unless the number of grid points is kept very small. Given that computing the copula on a grid is typically done for graphical purposes, providing u is typically done for two-dimensional problems only. Even here, however, providing a grid of length 100 will expand into a matrix of dimension 10000 by 2 which, though not memory intensive, may be computationally burdensome.

The ‘quasi-inverse’ is computed via Definition 2.3.6 from Nelsen (2006). We compute an equi-quantile grid on the range of the data of length n.quasi.inv/2. We then extend the range of the data by the factor er.quasi.inv and compute an equi-spaced grid of points of length n.quasi.inv/2 (e.g. using the default er.quasi.inv=1 we go from the minimum data value minus 1\times the range to the maximum data value plus 1\times the range for each marginal). We then take these two grids, concatenate and sort, and these form the final grid of length n.quasi.inv for computing the quasi-inverse.

Note that if u is provided and any elements of (the columns of) u are such that they lie beyond the respective values of F for the evaluation data for the respective marginal, such values are reset to the minimum/maximum values of F for the respective marginal. It is therefore prudent to inspect the values of u returned by npcopula when u is provided.

Note that copula are only defined for data of type numeric or ordered.

Value

npcopula returns an object of type data.frame with the following components

`copula`	the copula (bandwidth object obtained from `npudistbw`) or copula density (bandwidth object obtained from `npudensbw`)
`u`	the matrix of marginal u values associated with the sample realizations (`u=NULL`) or those created via `expand.grid` when `u` is provided
`data`	the matrix of marginal quantiles constructed when `u` is provided (`data` returned has the same names as `data` inputted)

Usage Issues

See the example below for proper usage.

Author(s)

Jeffrey S. Racine racinej@mcmaster.ca

References

Nelsen, R. B. (2006), An Introduction to Copulas, Second Edition, Springer-Verlag.

Racine, J.S. (2015), “Mixed Data Kernel Copulas,” Empirical Economics, 48, 37-59.

Examples

## Not run: 
## Example 1: Bivariate Mixed Data

require(MASS)

set.seed(42)

## Simulate correlated Gaussian data (rho(x,y)=0.99)

n <- 1000
n.eval <- 100
rho <- 0.99
mu <- c(0,0)
Sigma <- matrix(c(1,rho,rho,1),2,2)
mydat <- mvrnorm(n=n, mu, Sigma)
mydat <- data.frame(x=mydat[,1],
                    y=ordered(as.integer(cut(mydat[,2],
                      quantile(mydat[,2],seq(0,1,by=.1)),
                      include.lowest=TRUE))-1))
q.min <- 0.0
q.max <- 1.0
grid.seq <- seq(q.min,q.max,length=n.eval)
grid.dat <- cbind(grid.seq,grid.seq)

## Estimate the copula (bw object obtained from npudistbw())

bw.cdf <- npudistbw(~x+y,data=mydat)
copula <- npcopula(bws=bw.cdf,data=mydat,u=grid.dat)

## Plot the copula


contour(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
        xlab="u1",
        ylab="u2",
        main="Copula Contour")

persp(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
      ticktype="detailed",
      xlab="u1",
      ylab="u2",
      zlab="Copula",zlim=c(0,1))

## Plot the empirical copula

copula.emp <- npcopula(bws=bw.cdf,data=mydat)

plot(copula.emp$u1,copula.emp$u2,
     xlab="u1",
     ylab="u2",
     cex=.25,
     main="Empirical Copula")

## Estimate the copula density (bw object obtained from npudensbw())

bw.pdf <- npudensbw(~x+y,data=mydat)
copula <- npcopula(bws=bw.pdf,data=mydat,u=grid.dat)

## Plot the copula density

persp(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
      ticktype="detailed",
      xlab="u1",
      ylab="u2",
      zlab="Copula Density")

## Example 2: Bivariate Continuous Data

require(MASS)

set.seed(42)

## Simulate correlated Gaussian data (rho(x,y)=0.99)

n <- 1000
n.eval <- 100
rho <- 0.99
mu <- c(0,0)
Sigma <- matrix(c(1,rho,rho,1),2,2)
mydat <- mvrnorm(n=n, mu, Sigma)
mydat <- data.frame(x=mydat[,1],y=mydat[,2])

q.min <- 0.0
q.max <- 1.0
grid.seq <- seq(q.min,q.max,length=n.eval)
grid.dat <- cbind(grid.seq,grid.seq)

## Estimate the copula (bw object obtained from npudistbw())

bw.cdf <- npudistbw(~x+y,data=mydat)
copula <- npcopula(bws=bw.cdf,data=mydat,u=grid.dat)

## Plot the copula

contour(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
        xlab="u1",
        ylab="u2",
        main="Copula Contour")

persp(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
      ticktype="detailed",
      xlab="u1",
      ylab="u2",
      zlab="Copula",
      zlim=c(0,1))

## Plot the empirical copula

copula.emp <- npcopula(bws=bw.cdf,data=mydat)

plot(copula.emp$u1,copula.emp$u2,
     xlab="u1",
     ylab="u2",
     cex=.25,
     main="Empirical Copula")

## Estimate the copula density (bw object obtained from npudensbw())

bw.pdf <- npudensbw(~x+y,data=mydat)
copula <- npcopula(bws=bw.pdf,data=mydat,u=grid.dat)

## Plot the copula density

persp(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
      ticktype="detailed",
      xlab="u1",
      ylab="u2",
      zlab="Copula Density")

## End(Not run)

JeffreyRacine/R-Package-np documentation built on Feb. 14, 2025, 7:28 a.m.