library(copsupp)
library(CopulaModel)

The copsupp package contains functions meant as an extension to the CopulaModel package, largely to make working with vines easier.

User-friendliness is the aim: the user should not have to deal with many objects, and the functions should not be picky about what they accept. This way, the user can focus on the big picture without worrying about extraneous objects and making "work-arounds" on the objects.

The goal of this vignette is to familiarize the user with package functions and the objects recognized by those functions, while introducing the terminology used.

Objects Summary

Here are the objects that copsupp functions recognize:

Vine Arrays

Like the CopulaModel package, the vine arrays that copsupp functions recognize are matrices with the variable labels on the diagonal, and the row number corresponds to tree depth in the vine. What differs is that these vine arrays can also be truncated: an ntrunc-truncated vine array removes rows in the vine array after row ntrunc, except for the variable labels, which get shifted upwards.

Allowing for the vine arrays themselves to be truncated has some advantages:

There are several functions to work with vine arrays. To demonstrate, let's first build a vine array using makeuppertri.

(A <- makeuppertri(c(1, 1, 2, 3, 3, 3, 2,
                     2, 1, 2, 4, 4, 3,
                     3, 1, 2, 6, 4,
                     4, 6, 5, 7), nrow = 4, ncol = 7, incDiag = TRUE))

Truncating a Vine Array

Since it's not so simple to truncate a vine array, there's a function for that:

truncvarray(A, 2)

Variable Labels

We can extract the variable labels in the order that they're introduced in the vine array:

varray.vars(A)

This is useful because, with truncated vine arrays, the variable labels no longer strictly lie on the diagonal of the vine array, so it no longer suffices to use diag(A) to extract the variables.

Although they don't have to be, there are two ways to label the variables in a vine array:

  1. So that the labels are also the column number of the data, and/or
  2. So that the labels are also the order that the variables are introduced in the vine array (i.e. so that the labels are exactly 1:ncol(A)).

CopulaModel often insists that both interpretations of the labels hold -- which means you often need to relabel your vine arrays and subsequently permute your data to reflect the changes. copsupp only insists on the first interpretation (if at all), so that you don't have to bother permuting your data. It can do this because indicating order is redundant: it can be obtained by looking at the column number in the vine array. This opens up the possibility for the labels to be outside of the set {1:ncol(A)}, which is useful when we're looking at a vine of a subset of variables (which CopulaModel can't always handle).

If you're in a situation where you need to relabel your vine array, you can use relabel.varray():

relabel.varray(A, c(3, 4, 20, 1, 6, 7, 50))

This function is more powerful than varrayperm() in CopulaModel, because varrayperm() will not accept truncated vine arrays nor vine arrays with labels outside of the set {1:ncol(A)}.

Note that varrayperm() uses the terminology permute, as in a permutation of the variable names, but this can be easily confused with a permutation of the order of the vine array. Plus, if there are labels outside of the set {1:ncol(A)}, a relabelling may not actually be a permutation of the labels. Really, the "permute" terminology here is an allusion that you'll probably have to permute your data matrix after the relabelling.

Re-Ordering a Vine Array

Suppose you would like to convert a vine array so that a certain variable appears at the end of the vine array (bottom-right corner) -- in other words, re-leaf a vine array (I call the last variable in the vine array a leaf, so the leaves of a vine are all variables that can be written as a leaf in a vine array). If this conversion is possible, releaf.varray() can help you out:

releaf.varray(A, 1)

Now variable 1 is a leaf. How about we try variable 4?

releaf.varray(A, 4)

Variable 4 is not a leaf in the vine, so releaf.varray() returns NULL.

What if we want the vine array to be in natural order? Well, natural order doesn't exist when a vine is truncated, but there's a weaker version that can be useful, and that's to re-write the vine array so that the first ntrunc variables (where ntrunc = min(nrow(A)-1, ncol(A)-2) is the truncation level) are not leaves in the vine. I call this centering the vine array:

center.varray(A)

It's useful when re-leafing a vine array, which now only involves permuting the (ntrunc+1):ncol(A) columns.

Subsetting a Vine

If you want to obtain a vine array of a subset of variables, it may not always be possible, but if it is, rvinesubset() can help you. Let's subset variables 7, 2, 3, 4:

rvinesubset(A, c(7, 2, 3, 4))

The subsetted vine exists (note that it doesn't put the variables in the order specified). How about a vine array for 2, 3, 5?

rvinesubset(A, c(2, 3, 5))

It doesn't exist.

(Technicality: "not existing" doesn't mean that a vine can't be written for these variables -- it really means that the subset of the specified vine array doesn't exist.)

Another Type of Vine Array?

When we start allowing vine arrays to be truncated, the format gets somewhat clumsy because the variable labels are "bent". Instead, it's easier to move the variable labels to a new row on top of the vine array. In this form,

Because it's so much easier to work with, I call it a convenient vine array. However, the front-end of copsupp doesn't deal with convenient vine arrays to try and stay in line with CopulaModel. But some functions are easier to code when the arrays are in this form. Here are functions to convert between the two types of vine arrays:

(Acon <- Atocon(A))  # Convenient Vine Array
contoA(Acon)  # Back to regular Vine Array

I @exported them because they might come in handy.

Copula Matrix and Copula Parameter Matrix

To specify a vine distribution further, copsupp understands a copula matrix that goes along with a vine array, and a copula parameter matrix that goes along with a copula matrix.

A copula matrix should be upper-triangular with number of rows equal the truncation level, and number of columns equal the number of variables (=ncol(A)). The entries should be characters of the copula family names for the corresponding edge in the vine array. Let's build one for our array A:

(copmat <- makeuppertri(c(rep("gum", 6), 
                          "frk", "bvtcop", "joeu", "bvncop", "mtcjv",
                          "mtcjr", "indepcop", "indepcop", "indepcop"), 3, 7, blanks = ""))

Now let's indicate the parameter for each family in copmat as a copula parameter matrix whose entries correspond to the same entries in copmat. Notice that some families accept a different number of parameters than 1. We can put that information in the copula parameter matrix by allowing each entry to have any number of parameters, with the help of makeuppertri.list():

(cparmat <- makeuppertri.list(c(3:8/2,
                                3, 0.5, 4, 2, 0.7, 1.7,
                                1.3), 
                              len = c(1,1,1,1,1,1,1,2,1,1,1,1,0,0,0), 
                              nrow = 3, ncol = 7))

Each entry is a list of length 1 containing a vector. So we can extract:

cparmat[2, 4][[1]]
cparmat[3, 7][[1]]
cparmat[1, 2][[1]]

If all copula families have exactly 1 parameter, feel free to specify a simpler matrix using, for example, makeuppertri() instead -- copsupp functions will recognize both types of matrices.

CopulaModel typically uses a vector of parameters, along with a vector indicating the lengths of each copula parameter. But storing the copula parameters in a matrix like the one above makes it easier to extract the proper copula parameters, has less "moving parts" because it contains all the information, and is just easier to read.

You'll notice that copsupp also embraces the independence copula as an option in the vine. This is not always true with CopulaModel, which expects at least one parameter to be associated with each copula family.

Supplemental Copula Functions

The CopulaModel package comes with a suite of functions for different copula families. A family has a unique name, such as "frk" for the Frank copula, or "gum" for the Gumbel copula (as seen in the copmat matrix), and functions associated with the copula family. For a copula model named "cop", the functions available are:

Sometimes, if the upper and lower tails of the copula are asymmetric, there will be a reflected version of the copula family (the survival copula), whose family is indicated by appending an "r" to the name (such as "gumr").

However, there are two incompletions in the CopulaModel package.

  1. Some copula families are missing functions.
  2. Copula families should also be reflected horizontally and vertically to allow for negative dependence.

copsupp adds the above missing functions, and labels the horizontally-reflected (or "U-flipped") copula as "copu", and the vertically-reflected (or "V-flipped") copula as "copv", for the following copulas:

"bb1", "bb6", "bb7", "bb8", "gum", "joe", "mtcj".

Some functions for the independence copula are also completed -- but of course not reflected:

Reshaping Copula and Parameter Matrices

When you re-order, truncate, and/or subset a vine array, the corresponding copula matrix and copula parameter matrix needs changing too. Use reform.copmat() to help.

(Anew <- truncvarray(rvinesubset(A, 2:6), 2))
(copmatnew <- reform.copmat(copmat, Anew, A))
(cparmatnew <- reform.copmat(cparmat, Anew, A))

Vine Distributions

Now that we have a vine distribution fully specified by three objects (the vine array, copula matrix, and copula parameter matrix), copsupp has functions that specify different distributional quantities -- just like the suite of functions that go along with a copula family like "frk" or "gum". (So far, the suite of functions for a vine distribution is not as extensive.)

Simulate data:

(dat <- fvinesim(10, A = A, cops = copmat, cpars = cparmat))

The next functions call on other functions that do not accept independence copulas in the vine array. Until this is fixed, let's just demonstrate the functions by truncating the third tree:

At <- truncvarray(A, 2)
copmatt <- reform.copmat(copmat, At, A)
cparmatt <- reform.copmat(cparmat, At, A)

Evaluate the density at these data:

drvine(dat, A = At, copmat = copmatt, cparmat = cparmatt)
logdrvine(dat, A = At, copmat = copmatt, cparmat = cparmatt)

Evaluate the conditional cdf of a variable given the others. How about variable 5 given the others? 5 is a valid leaf, so there's an algorithm for that:

pcondrvine(dat, 5, A = At, copmat = copmatt, cparmat = cparmatt, .print = TRUE)

How about 3 given the others? 3 is not a valid leaf, so it doesn't use the algorithm -- it integrates the vine density:

pcondrvine(dat, 3, A = At, copmat = copmatt, cparmat = cparmatt, .print = TRUE)

There are some older functions that may become obsolete. They find the conditional cdf and quantile functions in the case that the vine is a D-vine. They are less friendly to use though -- they are pcondD(), pcondD.generic(), qcondD(), qcondD.generic().

Other Functions

Here are some other functions that are included, but may migrate somewhere else in the future because they don't seem to fit the theme of copsupp.

Marginal distributions

So far, only Uniform data were considered, and a vine distribution was specified by the vine array and the copula and parameter matrices. But we should really include the univariate marginals too. The functions logdrvine(), drvine(), pcondrvine() all accept different marginals than Uniform.

copsupp also comes with a function to fit marginal distributions to univariate data. Since copsupp's theme isn't really model-fitting, this function is sort of out-of-place and may migrate somewhere else in the future. But here's a demo anyway.

Let's get a univariate sample, any univariate sample (continuous):

x <- rnorm(100)

Let's fit an empirical distribution to it:

(empmarg <- marginal(x))

Notice that it provides the cdf, quantile function, and density. Its cdf is "softened" so that it never evaluates to 0 or 1:

empmarg$cdf(max(x) + 1)

Let's fit a Normal distribution:

marginal(x, "norm")

How about a t distribution? Notice that we need to specify a starting value in the estimation. marginal() isn't actually smart with choosing starting values -- it uses the default for the distribution, if it exists. But there's no default for Student-t, so we'll need to specify it.

marginal(x, "t", init.val = 30)

Want to fit a model to the tail (like Exponential), and put the empirical distribution below it? You can do that -- check out the cdf too:

(tmarg <- marginal(x, "exp", ecdf.split = 0))
tcdf <- tmarg$cdf
curve(tcdf, -3, 3, n = 1000)

Optimization

Do you hate it when "nlm()" tries to evaluate outside of your parameter space and it throws an error? You can use the "restricted" nlm function, "rnlm()", to make your objective function evaluate at a very large number outside of the parameter space.

Convert Copula Number to Name

In the VineCopula package, copula families are given index codes. Use copnum2name() to convert the code to the copula name:

copnum2name(c(1, 5, 9, 39))

Fit a vine or a Bayesian Network

Fit a vine to data using fit.rvine(). Fit a Bayesian Network (i.e. linking a variable sequentially with other variables) to data using fit.BN().

Multivariate Integration

Integrate more than one variable with integrate.mv().



vincenzocoia/copsupp documentation built on Aug. 23, 2020, 7:37 a.m.