npcopula | R Documentation |
npcopula
implements the nonparametric mixed data kernel copula
approach of Racine (2015) for an arbitrary number of dimensions
npcopula(bws,
data,
u = NULL,
n.quasi.inv = 1000,
er.quasi.inv = 1)
bws |
an unconditional joint distribution ( |
data |
a data frame containing variables used to construct |
u |
an optional matrix of real numbers lying in [0,1], each column of which corresponds to the vector of uth quantile values desired for each variable in the copula (otherwise the u values returned are those corresponding to the sample realizations) |
n.quasi.inv |
number of grid points generated when |
er.quasi.inv |
number passed to |
npcopula
computes the nonparametric copula or copula density
using inversion (Nelsen (2006), page 51). For the inversion approach,
we exploit Sklar's theorem (Corollary 2.3.7, Nelsen (2006)) to produce
copulas directly from the joint distribution function using
C(u,v) = H(F^{-1}(u),G^{-1}(v))
rather than the typical approach
that instead uses H(x,y) = C(F(x),G(y))
. Whereas the latter
requires kernel density estimation on a d-dimensional unit hypercube
which necessitates the use of boundary correction methods, the former
does not.
Note that if u
is provided then expand.grid
is
called on u
. As the dimension increases this can become
unwieldy and potentially consume an enormous amount of memory unless
the number of grid points is kept very small. Given that computing the
copula on a grid is typically done for graphical purposes, providing
u
is typically done for two-dimensional problems only. Even
here, however, providing a grid of length 100 will expand into a
matrix of dimension 10000 by 2 which, though not memory intensive, may
be computationally burdensome.
The ‘quasi-inverse’ is computed via Definition 2.3.6 from
Nelsen (2006). We compute an equi-quantile grid on the range of the
data of length n.quasi.inv/2
. We then extend the range of the
data by the factor er.quasi.inv
and compute an equi-spaced grid
of points of length n.quasi.inv/2
(e.g. using the default
er.quasi.inv=1
we go from the minimum data value minus
1\times
the range to the maximum data value plus
1\times
the range for each marginal). We then take these two
grids, concatenate and sort, and these form the final grid of length
n.quasi.inv
for computing the quasi-inverse.
Note that if u
is provided and any elements of (the columns of)
u
are such that they lie beyond the respective values of
F
for the evaluation data for the respective marginal, such
values are reset to the minimum/maximum values of F
for the
respective marginal. It is therefore prudent to inspect the values of
u
returned by npcopula
when u
is provided.
Note that copula are only defined for data of type
numeric
or ordered
.
npcopula
returns an object of type data.frame
with the following components
copula |
the copula (bandwidth object obtained from |
u |
the matrix of marginal u values associated with the sample
realizations ( |
data |
the matrix of marginal quantiles constructed when
|
See the example below for proper usage.
Jeffrey S. Racine racinej@mcmaster.ca
Nelsen, R. B. (2006), An Introduction to Copulas, Second Edition, Springer-Verlag.
Racine, J.S. (2015), “Mixed Data Kernel Copulas,” Empirical Economics, 48, 37-59.
npudensbw,npudens,npudist
## Not run:
## Example 1: Bivariate Mixed Data
require(MASS)
set.seed(42)
## Simulate correlated Gaussian data (rho(x,y)=0.99)
n <- 1000
n.eval <- 100
rho <- 0.99
mu <- c(0,0)
Sigma <- matrix(c(1,rho,rho,1),2,2)
mydat <- mvrnorm(n=n, mu, Sigma)
mydat <- data.frame(x=mydat[,1],
y=ordered(as.integer(cut(mydat[,2],
quantile(mydat[,2],seq(0,1,by=.1)),
include.lowest=TRUE))-1))
q.min <- 0.0
q.max <- 1.0
grid.seq <- seq(q.min,q.max,length=n.eval)
grid.dat <- cbind(grid.seq,grid.seq)
## Estimate the copula (bw object obtained from npudistbw())
bw.cdf <- npudistbw(~x+y,data=mydat)
copula <- npcopula(bws=bw.cdf,data=mydat,u=grid.dat)
## Plot the copula
contour(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
xlab="u1",
ylab="u2",
main="Copula Contour")
persp(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
ticktype="detailed",
xlab="u1",
ylab="u2",
zlab="Copula",zlim=c(0,1))
## Plot the empirical copula
copula.emp <- npcopula(bws=bw.cdf,data=mydat)
plot(copula.emp$u1,copula.emp$u2,
xlab="u1",
ylab="u2",
cex=.25,
main="Empirical Copula")
## Estimate the copula density (bw object obtained from npudensbw())
bw.pdf <- npudensbw(~x+y,data=mydat)
copula <- npcopula(bws=bw.pdf,data=mydat,u=grid.dat)
## Plot the copula density
persp(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
ticktype="detailed",
xlab="u1",
ylab="u2",
zlab="Copula Density")
## Example 2: Bivariate Continuous Data
require(MASS)
set.seed(42)
## Simulate correlated Gaussian data (rho(x,y)=0.99)
n <- 1000
n.eval <- 100
rho <- 0.99
mu <- c(0,0)
Sigma <- matrix(c(1,rho,rho,1),2,2)
mydat <- mvrnorm(n=n, mu, Sigma)
mydat <- data.frame(x=mydat[,1],y=mydat[,2])
q.min <- 0.0
q.max <- 1.0
grid.seq <- seq(q.min,q.max,length=n.eval)
grid.dat <- cbind(grid.seq,grid.seq)
## Estimate the copula (bw object obtained from npudistbw())
bw.cdf <- npudistbw(~x+y,data=mydat)
copula <- npcopula(bws=bw.cdf,data=mydat,u=grid.dat)
## Plot the copula
contour(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
xlab="u1",
ylab="u2",
main="Copula Contour")
persp(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
ticktype="detailed",
xlab="u1",
ylab="u2",
zlab="Copula",
zlim=c(0,1))
## Plot the empirical copula
copula.emp <- npcopula(bws=bw.cdf,data=mydat)
plot(copula.emp$u1,copula.emp$u2,
xlab="u1",
ylab="u2",
cex=.25,
main="Empirical Copula")
## Estimate the copula density (bw object obtained from npudensbw())
bw.pdf <- npudensbw(~x+y,data=mydat)
copula <- npcopula(bws=bw.pdf,data=mydat,u=grid.dat)
## Plot the copula density
persp(grid.seq,grid.seq,matrix(copula$copula,n.eval,n.eval),
ticktype="detailed",
xlab="u1",
ylab="u2",
zlab="Copula Density")
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.