gamVineCopSelect: Sequential pair-copula selection and maximum penalized...

Description Usage Arguments Value See Also Examples

View source: R/gamVineCopSelect.R

Description

This function select the copula family and estimates the parameter(s) of a Generalized Additive model (GAM) Vine model, where GAMs for individual edges are specified either for the copula parameter or Kendall's tau. It solves the maximum penalized likelihood estimation for the copula families supported in this package by reformulating each Newton-Raphson iteration as a generalized ridge regression, which is solved using the mgcv package.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
gamVineCopSelect(
  data,
  Matrix,
  lin.covs = NULL,
  smooth.covs = NULL,
  simplified = FALSE,
  familyset = NA,
  rotations = TRUE,
  familycrit = "AIC",
  level = 0.05,
  trunclevel = NA,
  tau = TRUE,
  method = "FS",
  tol.rel = 0.001,
  n.iters = 10,
  parallel = FALSE,
  verbose = FALSE,
  select.once = TRUE,
  ...
)

Arguments

data

A matrix or data frame containing the data in [0,1]^d.

Matrix

Lower triangular d x d matrix that defines the R-vine tree structure.

lin.covs

A matrix or data frame containing the parametric (i.e., linear) covariates (default: lin.covs = NULL).

smooth.covs

A matrix or data frame containing the non-parametric (i.e., smooth) covariates (default: smooth.covs = NULL).

simplified

If TRUE, then a simplified vine is fitted (which is possible only if there are exogenous covariates). If FALSE (default), then a non-simplified vine is fitted.

familyset

An integer vector of pair-copula families to select from (the independence copula MUST NOT be specified in this vector unless one wants to fit an independence vine!). The vector has to include at least one pair-copula family that allows for positive and one that allows for negative dependence. Not listed copula families might be included to better handle limit cases. If familyset = NA (default), selection among all possible families is performed. Coding of pair-copula families: 1 Gaussian, 2 Student t, 3 Clayton, 4 Gumbel, 13 Survival Clayton, 14 Survival Gumbel, 23 Rotated (90 degrees) Clayton, 24 Rotated (90 degrees) Gumbel, 33 Rotated (270 degrees) Clayton and 34 Rotated (270 degrees) Gumbel.

rotations

If TRUE, all rotations of the families in familyset are included.

familycrit

Character indicating the criterion for bivariate copula selection. Possible choices: familycrit = 'AIC' (default) or 'BIC', as in BiCopSelect from the VineCopula package.

level

Numerical; Passed to gamBiCopSelect, it is the significance level of the test for removing individual predictors (default: level = 0.05) for each conditional pair-copula.

trunclevel

Integer; level of truncation.

tau

TRUE (default) for a calibration function specified for Kendall's tau or FALSE for a calibration function specified for the Copula parameter.

method

'NR' for Newton-Raphson and 'FS' for Fisher-scoring (default).

tol.rel

Relative tolerance for 'FS'/'NR' algorithm.

n.iters

Maximal number of iterations for 'FS'/'NR' algorithm.

parallel

TRUE (default) for parallel selection of copula family at each edge or FALSE for the sequential version. for the Copula parameter.

verbose

TRUE if informations should be printed during the estimation and FALSE (default) for a silent version. from mgcv.

select.once

if TRUE the GAM structure is only selected once, for the family that appears first in familyset.

...

Additional parameters to be passed to gam from mgcv.

Value

gamVineCopSelect returns a gamVine-class object.

See Also

gamVineSeqFit,gamVineStructureSelect, gamVine-class, gamVineSimulate and gamBiCopFit.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
require(mgcv)
set.seed(0)

##  Simulation parameters
# Sample size
n <- 1e3
# Copula families
familyset <- c(1:2, 301:304, 401:404)
# Define a 4-dimensional R-vine tree structure matrix
d <- 4
Matrix <- c(2, 3, 4, 1, 0, 3, 4, 1, 0, 0, 4, 1, 0, 0, 0, 1)
Matrix <- matrix(Matrix, d, d)
nnames <- paste("X", 1:d, sep = "")

## A function factory
eta0 <- 1
calib.surf <- list(
  calib.quad <- function(t, Ti = 0, Tf = 1, b = 8) {
    Tm <- (Tf - Ti) / 2
    a <- -(b / 3) * (Tf^2 - 3 * Tf * Tm + 3 * Tm^2)
    return(a + b * (t - Tm)^2)
  },
  calib.sin <- function(t, Ti = 0, Tf = 1, b = 1, f = 1) {
    a <- b * (1 - 2 * Tf * pi / (f * Tf * pi +
      cos(2 * f * pi * (Tf - Ti))
      - cos(2 * f * pi * Ti)))
    return((a + b) / 2 + (b - a) * sin(2 * f * pi * (t - Ti)) / 2)
  },
  calib.exp <- function(t, Ti = 0, Tf = 1, b = 2, s = Tf / 8) {
    Tm <- (Tf - Ti) / 2
    a <- (b * s * sqrt(2 * pi) / Tf) * (pnorm(0, Tm, s) - pnorm(Tf, Tm, s))
    return(a + b * exp(-(t - Tm)^2 / (2 * s^2)))
  }
)

##  Create the model
# Define gam-vine model list
count <- 1
model <- vector(mode = "list", length = d * (d - 1) / 2)
sel <- seq(d, d^2 - d, by = d)

# First tree
for (i in 1:(d - 1)) {
  # Select a copula family
  family <- sample(familyset, 1)
  model[[count]]$family <- family

  # Use the canonical link and a randomly generated parameter
  if (is.element(family, c(1, 2))) {
    model[[count]]$par <- tanh(rnorm(1) / 2)
    if (family == 2) {
      model[[count]]$par2 <- 2 + exp(rnorm(1))
    }
  } else {
    if (is.element(family, c(401:404))) {
      rr <- rnorm(1)
      model[[count]]$par <- sign(rr) * (1 + abs(rr))
    } else {
      model[[count]]$par <- rnorm(1)
    }
    model[[count]]$par2 <- 0
  }
  count <- count + 1
}

# A dummy dataset
data <- data.frame(u1 = runif(1e2), u2 = runif(1e2), matrix(runif(1e2 * d), 1e2, d))

# Trees 2 to (d-1)
for (j in 2:(d - 1)) {
  for (i in 1:(d - j)) {
    # Select a copula family
    family <- sample(familyset, 1)

    # Select the conditiong set and create a model formula
    cond <- nnames[sort(Matrix[(d - j + 2):d, i])]
    tmpform <- paste("~", paste(paste("s(", cond, ", k=10, bs='cr')",
      sep = ""
    ), collapse = " + "))
    l <- length(cond)
    temp <- sample(3, l, replace = TRUE)

    # Spline approximation of the true function
    m <- 1e2
    x <- matrix(seq(0, 1, length.out = m), nrow = m, ncol = 1)
    if (l != 1) {
      tmp.fct <- paste("function(x){eta0+",
        paste(sapply(1:l, function(x)
          paste("calib.surf[[", temp[x], "]](x[", x, "])",
            sep = ""
          )), collapse = "+"), "}",
        sep = ""
      )
      tmp.fct <- eval(parse(text = tmp.fct))
      x <- eval(parse(text = paste0("expand.grid(",
        paste0(rep("x", l), collapse = ","), ")",
        collapse = ""
      )))
      y <- apply(x, 1, tmp.fct)
    } else {
      tmp.fct <- function(x) eta0 + calib.surf[[temp]](x)
      colnames(x) <- cond
      y <- tmp.fct(x)
    }

    # Estimate the gam model
    form <- as.formula(paste0("y", tmpform))
    dd <- data.frame(y, x)
    names(dd) <- c("y", cond)
    b <- gam(form, data = dd)
    # plot(x[,1],(y-fitted(b))/y)

    # Create a dummy gamBiCop object
    tmp <- gamBiCopFit(data = data, formula = form, family = 1, n.iters = 1)$res

    # Update the copula family and the model coefficients
    attr(tmp, "model")$coefficients <- coefficients(b)
    attr(tmp, "model")$smooth <- b$smooth
    attr(tmp, "family") <- family
    if (family == 2) {
      attr(tmp, "par2") <- 2 + exp(rnorm(1))
    }
    model[[count]] <- tmp
    count <- count + 1
  }
}

# Create the gamVineCopula object
GVC <- gamVine(Matrix = Matrix, model = model, names = nnames)
print(GVC)
## Not run: 
## Simulate and fit the model
sim <- gamVineSimulate(n, GVC)
fitGVC <- gamVineSeqFit(sim, GVC, verbose = TRUE)
fitGVC2 <- gamVineCopSelect(sim, Matrix, verbose = TRUE)

## Plot the results
par(mfrow = c(3, 4))
plot(GVC, ylim = c(-2.5, 2.5))

plot(fitGVC, ylim = c(-2.5, 2.5))

plot(fitGVC2, ylim = c(-2.5, 2.5))

## End(Not run)

gamCopula documentation built on Feb. 6, 2020, 5:12 p.m.