R/gamVineStructureSelect.R

Defines functions valid.gamVineStructureSelect as.GVC fitAGAMCopula fitACopula intersectDifference wrapper.fitACopula buildNextGraph fitTree fitFirstTree findMST initFirstGraph gamVineStructureSelect

Documented in gamVineStructureSelect

#' Structure selection and estimation of a GAM-Vine model.
#' 
#' This function select the structure and estimates the parameter(s) of a 
#' Generalized Additive model 
#' (GAM) Vine model, where GAMs for individual edges are specified either for
#' the copula parameter or Kendall's tau.
#' It solves the maximum penalized likelihood estimation for the copula families 
#' supported in this package by reformulating each Newton-Raphson iteration as 
#' a generalized ridge regression, which is solved using 
#' the \code{\link[mgcv:mgcv-package]{mgcv}} package.
#'
#' @param udata A matrix or data frame containing the data in [0,1]^d.
#' @param lin.covs A matrix or data frame containing the parametric (i.e., 
#' linear) covariates (default: \code{lin.covs = NULL}).
#' @param smooth.covs A matrix or data frame containing the non-parametric 
#' (i.e., smooth) covariates (default: \code{smooth.covs = NULL}).
#' @param simplified If \code{TRUE} (default), then a simplified vine is fitted 
#' (which is possible only if there are exogenous covariates). If \code{FALSE}, 
#' then a non-simplified vine is fitted.
#' @param type \code{type = 0} (default) for a R-Vine and \code{type = 1} for a
#' C-Vine.
#' @param familyset An integer vector of pair-copula families to select from 
#' (the independence copula MUST NOT be specified in this vector unless one 
#' wants to fit an independence vine!). 
#' Not listed copula families might be included to better handle 
#' limit cases. If \code{familyset = NA} (default), selection among all 
#' possible families is performed. Coding of pair-copula families:  
#' \code{1} Gaussian, 
#' \code{2} Student t, 
#' \code{5} Frank, 
#' \code{301} Double Clayton type I (standard and rotated 90 degrees), 
#' \code{302} Double Clayton type II (standard and rotated 270 degrees), 
#' \code{303} Double Clayton type III (survival and rotated 90 degrees), 
#' \code{304} Double Clayton type IV (survival and rotated 270 degrees), 
#' \code{401} Double Gumbel type I (standard and rotated 90 degrees), 
#' \code{402} Double Gumbel type II (standard and rotated 270 degrees), 
#' \code{403} Double Gumbel type III (survival and rotated 90 degrees), 
#' \code{404} Double Gumbel type IV (survival and rotated 270 degrees).
#' @param rotations If \code{TRUE}, all rotations of the families in familyset 
#' are included.
#' @param familycrit Character indicating the criterion for bivariate copula 
#' selection. Possible choices: \code{familycrit = 'AIC'} (default) or 
#' \code{'BIC'}, as in \code{\link{BiCopSelect}} from the 
#' \code{\link[VineCopula:VineCopula-package]{VineCopula}} package. 
#' @param treecrit Character indicating how pairs are selected in each tree.
#' \code{treecrit = "tau"} uses the maximum spanning tree of the Kendall's tau 
#' (i.e., the tree of maximal overall dependence), \code{treecrit = "rho"} 
#' uses the Spearman's rho.
#' @param level Numerical; Passed to \code{\link{gamBiCopSelect}}, it is the
#' significance level of the test for removing individual
#' predictors (default: \code{level = 0.05}) for each conditional pair-copula. 
#' @param trunclevel Integer; level of truncation.
#' @param tau \code{TRUE} (default) for a calibration function specified for 
#' Kendall's tau or \code{FALSE} for a calibration function specified 
#' for the Copula parameter.
#' @param method \code{'NR'} for Newton-Raphson
#' and  \code{'FS'} for Fisher-scoring (default).
#' @param tol.rel Relative tolerance for \code{'FS'}/\code{'NR'} algorithm.
#' @param n.iters Maximal number of iterations for 
#' \code{'FS'}/\code{'NR'} algorithm.
#' @param parallel \code{TRUE} (default) for parallel selection of copula 
#' family at each edge or \code{FALSE} for the sequential version.
#' for the Copula parameter.
#' @param verbose \code{TRUE} if informations should be printed during the 
#' estimation and \code{FALSE} (default) for a silent version.
#' from \code{\link[mgcv:mgcv-package]{mgcv}}.
#' @param select.once if \code{TRUE} the GAM structure is only selected once,
#'   for the family that appears first in \code{familyset}.
#' @return \code{gamVineSeqFit} returns a \code{\link{gamVine-class}} object.
#' @examples
#' require(VineCopula)
#' set.seed(0)
#' 
#' 
#' ## An example with a 3-dimensional GAM-Vine
#' 
#' # Sample size
#' n <- 1e3
#' 
#' # Define a R-vine tree structure matrix
#' d <- 3
#' Matrix <- c(2,3,1,0,3,1,0,0,1)
#' Matrix <- matrix(Matrix,d,d)
#' nnames <- paste("x", 1:d, sep = "")
#' 
#' # Copula families for each edge
#' fam <- c(301,401,1)
#' 
#' # Parameters for the first tree (two unconditional copulas)
#' par <- c(1,2)
#' 
#' # Pre-allocate the GAM-Vine model list
#' count <- 1
#' model <- vector(mode = "list", length = d*(d-1)/2)
#' 
#' # The first tree contains only the two unconditional copulas
#' for (i in 1:(d-1)) {
#'   model[[count]] <- list(family = fam[count], par = par[count], par2 = 0)
#'   count <- count + 1
#' }
#' 
#' # The second tree contains a unique conditional copula
#' # In this first example, we take a linear calibration function (10*x-5)
#' 
#' # Set-up a dummy dataset
#' tmp <- data.frame(u1 = runif(1e2), u2 = runif(1e2), x1 = runif(1e2))
#' 
#' # Set-up an arbitrary linear model for the calibration function
#' model[[count]] <- gamBiCopFit(tmp, ~ x1, fam[count])$res
#' 
#' # Update the coefficients of the model
#' attr(model[[count]],"model")$coefficients <- c(-5, 10)
#' 
#' # Define gamVine object
#' GVC <- gamVine(Matrix = Matrix, model = model, names = nnames)
#' GVC
#' 
#' \dontrun{
#' # Simulate new data
#' simData <- data.frame(gamVineSimulate(n, GVC))
#' colnames(simData) <- nnames
#' 
#' # Fit data using sequential estimation assuming true model known
#' summary(fitGVC <- gamVineSeqFit(simData, GVC))
#' 
#' # Fit data using structure selection and sequential estimation
#' summary(fitGVC2 <- gamVineStructureSelect(simData, simplified = FALSE))}
#' 
#' @seealso  \code{\link{gamVineSeqFit}},\code{\link{gamVineCopSelect}}, 
#'  \code{\link{gamVine-class}}, \code{\link{gamVineSimulate}} and 
#'  \code{\link{gamBiCopSelect}}.
gamVineStructureSelect <- function(udata, lin.covs = NULL, smooth.covs = NULL,
                                   simplified = TRUE, type = 0, familyset = NA, 
                                   rotations = TRUE, familycrit = "AIC", 
                                   treecrit = "tau", level = 0.05, 
                                   trunclevel = NA, tau = TRUE, method = "FS",
                                   tol.rel = 0.001, n.iters = 10,
                                   parallel = FALSE, verbose = FALSE,
                                   select.once = TRUE) {
  
  tmp <- valid.gamVineStructureSelect(udata, lin.covs, smooth.covs, simplified, 
                                      type, familyset, rotations, familycrit,
                                      treecrit, level, trunclevel,
                                      tau, method, tol.rel, n.iters,
                                      parallel, verbose, select.once)
  if (tmp != TRUE) {
    stop(tmp)
  }
  
  ## Transform to dataframe, get dimensions, etc (see in utilsPrivate)
  tmp <- prepare.data2(udata, lin.covs, smooth.covs, 
                       trunclevel, familyset, rotations)
  n <- tmp$n
  d <- tmp$d
  l <- tmp$l
  nn <- tmp$nn
  data <- tmp$data
  covariates <- tmp$covariates
  trunclevel <- tmp$trunclevel
  familyset <- tmp$familyset
  
  if (type == 0) {
    type <- "RVine"
  } else {
    type <- "CVine"
  } 
  
  out <- list(Tree = vector("list", d - 1), Graph = vector("list", d - 1))
  #browser()
  if (is.null(colnames(udata)))
    colnames(udata) <- paste0("V", 1:ncol(udata))
  graph <- initFirstGraph(udata, treecrit)
  mst <- findMST(graph, type)
  tree <- fitFirstTree(mst, udata,  lin.covs, smooth.covs, l, covariates,
                       familyset, familycrit,  treecrit, level,
                       tau, method, tol.rel, n.iters, parallel,  verbose)
  out$Tree[[1]] <- tree
  out$Graph[[1]] <- graph
  
  oldtree  <- tree
  for(i in 2:(d-1)){
    if(trunclevel == i-1) {
      familyset <- 0
    }
    graph <- buildNextGraph(tree, udata, l, lin.covs, smooth.covs,
                            simplified, treecrit)
    mst <- findMST(graph, type)
    tree <- fitTree(mst, tree, udata, l, lin.covs, smooth.covs, simplified,
                    familyset, familycrit, treecrit, level,
                    tau, method, tol.rel, n.iters, parallel, verbose)
    
    out$Tree[[i]] <- tree
    out$Graph[[i]] <- graph
  }
  
  return(as.GVC(out,covariates))
}

initFirstGraph <- function(data, treecrit) {
  
  if (treecrit == "tau")
  {
    C <- TauMatrix(data)
  }
  if (treecrit == "rho")
  {
    C <- cor(data, method = "spearman")
  }
  
  rownames(C) <- colnames(C) <- colnames(data)
  
  g <- graph.adjacency(C, mode="lower", weighted=TRUE, diag=FALSE)
  E(g)$tau <- E(g)$weight
  E(g)$weight <- 1-abs(E(g)$weight)
  E(g)$name <- paste(get.edgelist(g)[,1], get.edgelist(g)[,2], sep=",")  
  for(i in 1:ecount(g)){
    E(g)$conditionedSet[[i]] <- ends(g,i,FALSE)
  }
  
  return(g)
}

findMST <- function(g, mode = "RVine") {
  
  if (mode == "RVine") {
    return(minimum.spanning.tree(g, weights=E(g)$weight))
  } else {
    M <- abs(get.adjacency(g, attr="weight", sparse=0))
    root <- which.min(rowSums(M))    
    Ecken <- ends(g,1:ecount(g),FALSE)
    pos <- Ecken[,2] == root | Ecken[,1] == root
    return(delete.edges(g, E(g)[!pos]))
  }
}

fitFirstTree <- function(mst, udata, lin.covs, smooth.covs, l, covariates, 
                         familyset, familycrit, treecrit, level, 
                         tau, method, tol.rel, n.iters, parallel, verbose) {
  
  d <- ecount(mst)
  
  parameterForACopula <- vector("list", d)
  
  for(i in 1:d) {
    
    a <- ends(mst,i,FALSE)
    
    if (is.null(V(mst)[a[1]]$name)) {
      E(mst)[i]$CondName1 <- a[1]
    } else {
      E(mst)[i]$CondName1 <- V(mst)[a[1]]$name
    }
    
    if(is.null(V(mst)[a[2]]$name)) {
      E(mst)[i]$CondName2 <- a[2]
    } else {
      E(mst)[i]$CondName2 <- V(mst)[a[2]]$name
    }
    
    if(is.null(V(mst)[a[1]]$name) || is.null(V(mst)[a[2]]$name)) {
      E(mst)[i]$Name <- paste(a[1], a[2], sep=" , ")
    } else {
      E(mst)[i]$Name <- paste(V(mst)[a[1]]$name, 
                              V(mst)[a[2]]$name, sep=" , ")
    } 
    
    if (l == 0) {
      parameterForACopula[[i]] <- list()
      parameterForACopula[[i]]$zr1 <- udata[,a[1]]
      parameterForACopula[[i]]$zr2 <- udata[,a[2]]
    } else {
      parameterForACopula[[i]] <- list(cbind(as.numeric(udata[,a[1]]), 
                                             as.numeric(udata[,a[2]])),
                                       lin.covs,
                                       smooth.covs)
    }
  }
  
  if (l == 0) {
    outForACopula <- lapply(parameterForACopula, wrapper.fitACopula, 
                            familyset, familycrit, level)
  } else {
    outForACopula <- lapply(parameterForACopula, wrapper.fitACopula, 
                            familyset, familycrit, level,
                            tau, method, tol.rel, n.iters, parallel)
  }
  for(i in 1:d) {
    E(mst)[i]$model <- list(outForACopula[[i]]$model)
    E(mst)[i]$CondData1 <- list(outForACopula[[i]]$CondOn1)
    E(mst)[i]$CondData2 <- list(outForACopula[[i]]$CondOn2)
  }
  
  return(mst)
}

fitTree <- function(mst, oldVineGraph, udata, l, lin.covs, smooth.covs,
                    simplified, familyset, familycrit, treecrit, level,
                    tau, method, tol.rel, n.iters, parallel, verbose) {
  
  d <- ecount(mst)
  
  parameterForACopula <- list()
  
  for(i in 1:d) {
    con <- ends(mst,i,FALSE)
    tmp <- ends(oldVineGraph,con,FALSE)
    
    if ((tmp[1,1] == tmp[2,1])|| (tmp[1,2] == tmp[2,1])) {
      same <- tmp[2,1]
    } else {
      if ((tmp[1,1] == tmp[2,2]) || (tmp[1,2] == tmp[2,2])) {
        same <- tmp[2,2]
      }
    }
    
    other1 <- tmp[1,tmp[1,] != same]
    other2 <- tmp[2,tmp[2,] != same]
    
    if(tmp[1,1] == same){
      zr1 <- E(oldVineGraph)[con[1]]$CondData2
      n1 <- E(oldVineGraph)[con[1]]$CondName2
    }else{
      zr1 <- E(oldVineGraph)[con[1]]$CondData1
      n1 <- E(oldVineGraph)[con[1]]$CondName1
    }
    
    if(tmp[2,1] == same){
      zr2 <- E(oldVineGraph)[con[2]]$CondData2
      n2 <- E(oldVineGraph)[con[2]]$CondName2
    }else{
      zr2 <- E(oldVineGraph)[con[2]]$CondData1
      n2 <- E(oldVineGraph)[con[2]]$CondName1
    }
    
    if(is.list(zr1)){
      zr1a <- as.vector(zr1[[1]])
      zr2a <- as.vector(zr2[[1]])
      n1a <- as.vector(n1[[1]])
      n2a <- as.vector(n2[[1]])
    }
    else{
      zr1a <- zr1
      zr2a <- zr2
      n1a <- n1
      n2a <- n2
    }
    
    if(verbose == TRUE) message(n1a," + ",n2a," --> ", E(mst)[i]$name)
    
    E(mst)[i]$CondName2 <- n1a
    E(mst)[i]$CondName1 <- n2a
    
    tmp <- E(mst)$conditioningSet[[i]]
    if (simplified) {
      if (l == 0) {
        parameterForACopula[[i]] <- list()
        parameterForACopula[[i]]$zr1 <- as.numeric(zr1a)
        parameterForACopula[[i]]$zr2 <- as.numeric(zr2a)
      } else {
        parameterForACopula[[i]] <- list(cbind(as.numeric(zr1a), 
                                               as.numeric(zr2a)),
                                         lin.covs,
                                         smooth.covs)
      }
    } else {
      temp <- as.data.frame(do.call(cbind,
                                    list(udata[,tmp],smooth.covs)))
      names(temp) <- c(colnames(udata)[tmp], colnames(smooth.covs))
      parameterForACopula[[i]] <- list(cbind(as.numeric(zr1a), 
                                             as.numeric(zr2a)),
                                       lin.covs, temp)
    }
  }
  
  outForACopula <- lapply(parameterForACopula, wrapper.fitACopula, 
                          familyset, familycrit, level,
                          tau, method, tol.rel, n.iters, parallel)
  
  for (i in 1:d) {
    E(mst)[i]$model <- list(outForACopula[[i]]$model)
    E(mst)[i]$CondData2 <- list(outForACopula[[i]]$CondOn2)
    E(mst)[i]$CondData1 <- list(outForACopula[[i]]$CondOn1)
  }
  
  return(mst)
}	

buildNextGraph <- function(graph, udata,  l, lin.covs, smooth.covs, simplified, 
                           treecrit) {
  
  EL <- get.edgelist(graph)
  d <- ecount(graph)
  
  g <- graph.full(d)
  V(g)$name <- E(graph)$name
  V(g)$conditionedSet <- E(graph)$conditionedSet
  
  if (!is.null(E(graph)$conditioningSet)) {
    V(g)$conditioningSet <- E(graph)$conditioningSet
  }
  
  for(i in 1:ecount(g)){
    
    con <- ends(g, i, FALSE)
    tmp <- ends(graph, con, FALSE)
    
    ok <- FALSE    
    if ((tmp[1,1] == tmp[2,1])|| (tmp[1,2] == tmp[2,1])) {
      ok <- TRUE
      same <- tmp[2,1]
    } else {
      if ((tmp[1,1] == tmp[2,2]) || (tmp[1,2] == tmp[2,2])) {
        ok <- TRUE
        same <- tmp[2,2] 
      }
    }
    
    if(ok){
      other1 <- tmp[1,tmp[1,] != same]
      other2 <- tmp[2,tmp[2,] != same]
      
      if (tmp[1,1] == same) {
        zr1 <- E(graph)[con[1]]$CondData2
      } else {
        zr1 <- E(graph)[con[1]]$CondData1
      }
      
      if (tmp[2,1] == same) {
        zr2 <- E(graph)[con[2]]$CondData2
      } else {
        zr2 <- E(graph)[con[2]]$CondData1
      }
      
      if (is.list(zr1)) {
        zr1a <- as.vector(zr1[[1]])
        zr2a <- as.vector(zr2[[1]])
      } else{
        zr1a <- zr1
        zr2a <- zr2
      }
      noNAs <- !(is.na(zr1a) | is.na(zr2a))
      
      name.node1 <- strsplit(V(g)[con[1]]$name,split=" *[,|] *")[[1]]
      name.node2 <- strsplit(V(g)[con[2]]$name,split=" *[,|] *")[[1]]   
      intersection <- c()
      
      for (j in 1:length(name.node1)) {
        for (k in 1:length(name.node2)) {
          if (name.node1[j] == name.node2[k]) {
            intersection <- c(intersection,name.node1[j])
            name.node1[j] <- ""
            name.node2[k] <- ""
            break
          }
        }
      }
      
      difference <- c()
      for (j in 1:length(name.node1)) {
        if (name.node1[j] != "") {
          difference <- c(difference, name.node1[j])
        }
      }
      
      for (j in 1:length(name.node2)) {
        if(name.node2[j] != "") {
          difference <- c(difference, name.node2[j])
        }
      }
      
      E(g)[i]$name <- paste(paste(difference, collapse= ","),
                            paste(intersection, collapse= ","), sep= " | ")
      
      if (is.list(V(g)[con[1]]$conditionedSet)) {
        l1 <- c(as.vector(V(g)[con[1]]$conditionedSet[[1]]),
                as.vector(V(g)[con[1]]$conditioningSet[[1]]))
        l2 <- c(as.vector(V(g)[con[2]]$conditionedSet[[1]]),
                as.vector(V(g)[con[2]]$conditioningSet[[1]]))
      } else {
        l1 <- c(V(g)[con[1]]$conditionedSet,V(g)[con[1]]$conditioningSet)
        l2 <- c(V(g)[con[2]]$conditionedSet,V(g)[con[2]]$conditioningSet)
      }
      out <- intersectDifference(l1,l2)
      
      suppressWarnings({E(g)$conditionedSet[i] <- list(out$difference)})
      suppressWarnings({E(g)$conditioningSet[i]  <- list(out$intersection)})
      
      if (treecrit == "tau") {
        E(g)[i]$weight <- 1-abs(fasttau(zr1a[noNAs], zr2a[noNAs]))
      } else if (treecrit == "rho") {
        E(g)[i]$weight <- 1-abs(cor(zr1a[noNAs], zr2a[noNAs], method = "spearman"))
      }
    }
    
    E(g)[i]$todel <- !ok
  }
  
  E(g)$tau <- E(g)$weight
  g <- delete.edges(g, E(g)[E(g)$todel])
  
  return(g)
}

wrapper.fitACopula <- function(parameterForACopula,...) {
  if (length(parameterForACopula) == 2) {
    return(fitACopula(parameterForACopula$zr1,parameterForACopula$zr2,...))
  } else {
    return(fitAGAMCopula(parameterForACopula,...))
  } 
}

intersectDifference <- function(liste1, liste2) {
  out <- list()	
  out$intersection <- c()
  out$difference <- c()
  
  for(j in 1:length(liste1)){
    for(k in 1:length(liste2)){
      if(!is.na(liste2[k]) && liste1[j] == liste2[k]){
        out$intersection <- c(out$intersection, liste1[j])
        liste1[j] <- NA
        liste2[k] <- NA
        break
      }
    }
  }
  
  for(j in 1:length(liste1)){
    if(!is.na(liste1[j])){
      out$difference <- c(out$difference, liste1[j])
    }
  }
  for(j in 1:length(liste2)){
    if(!is.na(liste2[j])){
      out$difference <- c(out$difference, liste2[j])
    }
  }
  
  return(out)
}

fitACopula <- function(u1, u2, familyset=NA, familycrit="AIC", 
                       level = 0.05, rotation = TRUE) {
  
  ## transform the familyset to codes for the VineCopula package
  fams <- famTrans(familyset, inv = FALSE, set = TRUE)
  
  ## select family and estimate parameter(s) for the pair copula
  out <- BiCopSelect(u1, u2,
                     fams,
                     familycrit,
                     indeptest = TRUE,
                     level = level,
                     rotations = FALSE)
  fam <- famTrans(out$family, inv = TRUE, 
                  par = cor(u1,u2), familyset = familyset)
  if (rotation == TRUE) {
    if (fam %in% c(301,303,401,403)) {
      fam <- fam+1
    } else if (fam %in% c(302,304,402,404)) {
      fam <- fam-1
    } 
  }
  out$family <- fam
  
  ## store pseudo-observations for estimation in next tree
  tmp <- bicoppd1d2(cbind(u1,u2,out$par,out$par2), out$family, p=FALSE, h=TRUE)
  
  ## save the model and pseudo-observations
  model <- list(family = out$family, par = out$par, par2 = out$par2)
  out <- list(model = model, CondOn1 = tmp[2,], CondOn2 = tmp[1,])
  
  return(out)
}

fitAGAMCopula <- function(data, familyset, familycrit, level, tau, 
                          method, tol.rel, n.iters, parallel, rotation = TRUE,
                          select.once = TRUE) {
  out <- list()
  u1 <- data[[1]][,1]
  u2 <- data[[1]][,2]
  udata <- cbind(u1, u2)
  lin.covs <- data[[2]]
  smooth.covs <- data[[3]]
  data <- do.call(cbind, data[!(sapply(data, is.null))])
  
  if (length(familyset) == 1 && familyset == 0) {
    ## independence copula
    
    out$model <- list(family = 0, par = 0, par2 = 0)    
    par <- rep(0, length(u1))
    fam <- 0
    par2 <- 0
  } else {
    ## transform the familyset to codes for the VineCopula package
    fams <- famTrans(familyset, inv = FALSE, set = TRUE)
    
    ## conditional copula
    tmp <- gamBiCopSelect(udata, lin.covs, smooth.covs,
                          familyset, FALSE, familycrit, level, 1.5, tau,
                          method, tol.rel, n.iters, parallel, 
                          select.once = select.once)
    if (!is.character(tmp)) {
      nvar <- unique(all.vars(tmp$res@model$pred.formula))
    }
    
    if (!is.character(tmp) && length(nvar) > 0) {
      out$model <- tmp$res
      par <- gamBiCopPredict(out$model, target = "par")$par
      fam <- out$model@family
      par2 <- out$model@par2
    } else {
      tmp <- BiCopSelect(u1, u2, fams, selectioncrit = familycrit,
                         indeptest = TRUE, level = level, rotations = FALSE)
      par <- rep(tmp$par, length(u1))
      out$model$par <- tmp$par
      out$model$family <- fam <- tmp$family
      out$model$par2 <- par2 <- tmp$par2
    }
    fam <- famTrans(fam, inv = TRUE, par = cor(u1,u2), familyset = familyset)
    if (rotation == TRUE) {
      if (fam %in% c(301,303,401,403)) {
        fam <- fam+1
      } else if (fam %in% c(302,304,402,404)) {
        fam <- fam-1
      } 
    }
    if (isS4(out$model)) {
      attr(out$model, "family") <- fam
    } else {
      out$model$family <- fam
    }
  }
  
  ## store pseudo-observations for estimation in next tree
  fams <- vapply(1:length(par), function(j) 
    famTrans(fam, inv = FALSE, par = par[j]), numeric(1))
  out$CondOn1 <- BiCopHfunc(u1, u2, fams, par, par2, check.pars = FALSE)$hfunc2
  out$CondOn2 <- BiCopHfunc(u1, u2, fams, par, par2, check.pars = FALSE)$hfunc1
  
  return(out)
}

as.GVC <- function(GVC, covariates){
  n <- length(GVC$Tree)+1
  con <- list()
  nam <- V(GVC$Tree[[1]])$name
  
  conditionedSets <- NULL
  correspondingModel <- NULL
  
  if (is.list(E(GVC$Tree[[n-1]])$conditionedSet)) {
    conditionedSets[[n-1]][[1]] <- (E(GVC$Tree[[n-1]])$conditionedSet[[1]])  
  }
  else{
    conditionedSets[[n-1]][[1]] <- (E(GVC$Tree[[n-1]])$conditionedSet)
  }
  
  for (k in 1:(n-2)) {
    conditionedSets[[k]] <- E(GVC$Tree[[k]])$conditionedSet
    correspondingModel[[k]] <- as.list(E(GVC$Tree[[k]])$model)
  }
  
  correspondingModel[[n-1]] <- E(GVC$Tree[[n-1]])$model
  
  model.count <- get.modelCount(n)  
  model <- vector("list", n*(n-1)/2)
  M <- matrix(NA,n,n)
  
  for(k in 1:(n-1)){
    w <- conditionedSets[[n-k]][[1]][1]
    
    M[k,k] <- w
    M[(k+1),k] <- conditionedSets[[n-k]][[1]][2]
    
    model[[model.count[(k+1),k]]] <- correspondingModel[[n-k]][[1]]
    if(k == (n-1)){
      M[(k+1),(k+1)] <- conditionedSets[[n-k]][[1]][2]
    } else {
      for(i in (k+2):n){
        for(j in 1:length(conditionedSets[[n-i+1]])){
          cs <- conditionedSets[[n-i+1]][[j]]
          tmp <- correspondingModel[[n-i+1]][[j]]
          if(cs[1] == w){
            M[i,k] <- cs[2]
            model[[model.count[i,k]]] <- tmp
            break
          } else if(cs[2] == w){
            M[i,k] = cs[1]
            if (isS4(tmp)) {
              fam <- attr(tmp, "family")
            } else {
              fam <- tmp$family
            }
            if (fam %in% c(301,303,401,403)) {
              fam <- fam+1
            } else if (fam %in% c(302,304,402,404)) {
              fam <- fam-1
            }
            
            if (isS4(tmp)) {
              attr(tmp, "family") <- fam
            } else {
              tmp$family <- fam
            }
            model[[model.count[i,k]]] <- tmp
            break
          }
        }
        
        conditionedSets[[n-i+1]][[j]] <- NULL
        correspondingModel[[n-i+1]][[j]] <- NULL
      }
    }
  }
  M[is.na(M)] <- 0
  return(gamVine(M, model, nam, covariates))
  
}

valid.gamVineStructureSelect <- function(data, lin.covs, smooth.covs, 
                                         simplified, type,
                                         familyset, rotations, familycrit, 
                                         treecrit, level, trunclevel, 
                                         tau, method, tol.rel, n.iters, 
                                         parallel, verbose, select.once) {  
  
  if (!is.matrix(data) && !is.data.frame(data)) {
    return("data has to be either a matrix or a data frame")
  } 
  tmp <- valid.gamBiCopSelect(data[,1:2], lin.covs, smooth.covs, rotations, 
                              familyset, familycrit, level, 2, tau, method, 
                              tol.rel, n.iters, parallel, verbose, select.once)
  if (tmp != TRUE)
    return(tmp)
  
  n <- dim(data)[1]
  d <- dim(data)[2]
  
  if (d < 2) {
    return("Number of dimensions has to be at least 2.")
  }
  if (n < 2) {
    return("Number of observations has to be at least 2.")
  }
  if (any(data[,1:d] > 1) || any(data[,1:d] < 0)) {
    return("Data has be in the interval [0,1].")
  }
  
  if (!valid.logical(simplified)) {
    return(msg.logical(var2char(simplified)))
  }
  
  if (is.null(lin.covs) && is.null(smooth.covs) && simplified == TRUE) {
    return("When there are no covariates, the vine can't be simplified.")
  }
  
  if (is.null(type) || length(type) != 1 || !is.element(type,c(0,1))) {
    return("Vine model not implemented.")
  }
  
  if(length(treecrit) != 1 || (treecrit != "tau" && treecrit != "rho")) {
    return("Selection criterion for the pair selection not implemented.")
  } 
  
  if (!is.na(trunclevel) && !valid.posint(trunclevel)) {
    return("'trunclevel' should be a positive integer or NA.")
  }
  
  if (!is.logical(select.once)) {
    return("'select.once' must be logical")
  }
  
  return(TRUE)
}

Try the gamCopula package in your browser

Any scripts or data that you put into this service are public.

gamCopula documentation built on Aug. 18, 2017, 1:01 a.m.