med.regressCOP: Perform Median Regression using a Copula by Numerical Derivative Method for V with respect to U

med.regressCOP {copBasic}  R Documentation

Perform Median Regression using a Copula by Numerical Derivative Method for V with respect to U

Description

Perform median regression (Nelsen, 2006, pp. 217–218) of a copula by inversion of the numerical derivatives of the copula (derCOPinv). The documentation for qua.regressCOP provides the mathematical details. The function qua.regressCOP.draw supports so-called quantile regression along various probability levels (see Examples).
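
In brief, for a given nonexceedance probability u, median regression seeks the value t for which the conditional distribution function of V given U = u equals one half, that is,

\frac{\partial\mathbf{C}(u,v)}{\partial u}\bigg|_{v = t} = \frac{1}{2}\mbox{,}

and derCOPinv performs the required numerical inversion of this derivative.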

Usage

med.regressCOP(u=seq(0.01,0.99, by=0.01), cop=NULL, para=NULL, level=NA, ...)

Arguments

u

Nonexceedance probability u in the X direction;

cop

A copula function;

para

Vector of parameters or other data structure, if needed, to pass to the copula;

level

The level of the prediction interval to compute. For example, level=0.95 will compute the 95-percent prediction interval as will level=0.05 because internally a reflection check is made; and

...

Additional arguments to pass, such as to qua.regressCOP and derCOPinv, which are called in succession.

Value

An R data.frame of the median regressed probabilities of V and the provided U values is returned. Note: if level is used, then the column ordering of the returned data.frame changes; please access the columns by name. The lower- and upper-prediction intervals are contained in the columns respectively titled Vlwr and Vupr to somewhat mimic the nomenclature of the function predict.lm() in R.
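
For example, a minimal sketch (using the Plackett copula with a purely illustrative \Theta = 20) that accesses the columns by name when level is set:

  R <- med.regressCOP(u=c(0.25, 0.50, 0.75),
                      cop=PLACKETTcop, para=20, level=0.90)
  R$V    # median regression estimates of V given U=u
  R$Vlwr # lower bound of the 90-percent prediction interval
  R$Vupr # upper bound of the 90-percent prediction interval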

Note

An extended demonstration is needed concerning prediction intervals by median regression and a comparison to well-known linear regression. This also affords an opportunity to have copBasic interact with the copula package to gain access to the Gaussian copula and a maximum pseudo-likelihood estimation of the parameter.

First, a function NORMcop() is defined to form the interconnect between the two packages. It is critically important that the user recognize that the so-called copula object as built by the copula package is treated as the canonical para argument in copBasic calls herein.

  "NORMcop" <-  # pCoupla() from package copula is analogous to COP()
  function(u,v, para=NULL, ...) {
    if(length(u) == 1) u <- rep(u, length(v)) # see asCOP() for reasoning of
    if(length(v) == 1) v <- rep(v, length(u)) # this "vectorization" hack
    return(copula::pCopula(matrix(c(u,v), ncol=2), para))
  }
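
As a quick sanity check of the interconnect (the \Theta = 0.7 here is simply illustrative), COP() from copBasic should reproduce copula::pCopula() through NORMcop():

  tmp.cop <- copula::normalCopula(0.7, dim=2) # illustrative Gaussian copula
  COP(0.3, 0.6, cop=NORMcop, para=tmp.cop)    # copBasic pathway
  copula::pCopula(cbind(0.3, 0.6), tmp.cop)   # direct copula-package pathway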

The parameter \Theta \in [-1, 1] is the Pearson product-moment correlation, and the corresponding \rho_\mathbf{C}(\Theta) (Spearman Rho, rhoCOP) and \tau_\mathbf{C}(\Theta) (Kendall Tau, tauCOP) are, according to Salvadori et al. (2007, p. 255), the values

\rho_\mathbf{C}(\Theta) = \frac{6}{\pi}\,\mathrm{arcsin}(\Theta/2)

and

\tau_\mathbf{C}(\Theta) = \frac{2}{\pi}\,\mathrm{arcsin}(\Theta)\mbox{.}
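
These relations can be checked numerically for the \Theta = 0.7 used below:

  Theta <- 0.7
  (6/pi)*asin(Theta/2) # 0.6829105 (Spearman Rho)
  (2/pi)*asin(Theta)   # 0.4936334 (Kendall Tau)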

Second, a bivariate Gaussian copula is defined with a parameter \Theta = 0.7 (thus \rho_\mathbf{C} = 0.6829105, rhoCOP(NORMcop, norm.cop)) and then n=255 samples are simulated from it. These are then cast into standard normal variates to mimic the idea of bivariate data in nonprobability units and to facilitate the regression comparison.

  norm.cop <- copula::normalCopula(c(0.7), dim = 2) # define a Gaussian copula
  UVs      <- copula::rCopula(255, norm.cop) # draw 255 samples
  UVs      <- as.data.frame(UVs); X <- qnorm(UVs[,1]); Y <- qnorm(UVs[,2])

Third, the Weibull plotting positions from the pp() function of package lmomco are used to estimate the empirical probabilities of the data, and these are cast into an R matrix UV because the copula package expects the data as a matrix for the default parameter estimation. The code is completed by specification of the fitted Gaussian copula in fnorm.cop.

  UV <- as.matrix(data.frame(U=lmomco::pp(X, sort=FALSE),
                             V=lmomco::pp(Y, sort=FALSE)))
  para <- copula::fitCopula(copula::normalCopula(dim=2), UV)
  para <- summary(para)$coefficients[1] # maximum pseudo-likelihood est.
  fnorm.cop <- copula::normalCopula(para, dim=2)
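
A brief check of the fit (exact numbers vary with the random sample) is that the Spearman Rho implied by the fitted parameter should be near the sample Spearman correlation of the pseudo-observations:

  rhoCOP(cop=NORMcop, para=fnorm.cop)    # theoretical, from the fitted parameter
  cor(UV[,1], UV[,2], method="spearman") # empirical, from the pseudo-observations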

Fourth, ordinary-least-squares (OLS) linear regression for Y\mid X and X\mid Y is computed, and the results are plotted on top of the data points. The 2/3-prediction limits are computed by predict.lm() and also shown.

  # Classical linear regressions from two perspectives.
  LMyx <- lm(Y~X);       LMxy <- lm(X~Y)
  YonX <- summary(LMyx); XonY <- summary(LMxy)

  QUorV <- seq(-3,3, by=0.05) # vector for graphical operations
  plot(X, Y, col=8, pch=21)
  lines(QUorV, YonX$coefficients[1]+YonX$coefficients[2]*QUorV, col=2, lwd=4)
  tmp <- predict.lm(LMyx, list(X=X), level=2/3, interval="prediction")
  lines(X, tmp[,2], col=2); lines(X, tmp[,3], col="red"  )

  lines(XonY$coefficients[1]+XonY$coefficients[2]*QUorV, QUorV, col=3, lwd=4)
  tmp <- predict.lm(LMxy, list(Y=Y), level=2/3, interval="prediction")
  lines(tmp[,2], Y, col="green"); lines(tmp[,3], Y, col="green")

Finally, the demonstration ends with plotting of the median regression for the Gaussian copula and drawing of the regression lines. The two median regression lines are nearly coincident with the OLS regression lines, as anticipated with a reasonably large sample size, even though maximum pseudo-likelihood was used to estimate the copula parameter. The mean of a uniformly distributed variable given, say, U = u (horizontal axis) is 1/2, which coincides with the median. The median regression lines thus are coincident with the OLS lines even though neither OLS nor real-space (native units of X and Y) values were used in their computation. The 2/3-prediction intervals also are plotted for comparison.
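
For the Gaussian copula this coincidence also has a closed form: back-transformed to standard-normal space, the conditional median of Y given X = x is

\mathrm{med}(Y \mid X = x) = \Theta\,x\mbox{,}

which is precisely the population OLS regression line for standardized variables.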

  UorV    <- c(0.001, seq(.02,0.98, by=.02), 0.999)
  MEDreg  <- med.regressCOP( u=UorV, level=2/3, cop=NORMcop, para=fnorm.cop)
  MEDreg2 <- med.regressCOP2(v=UorV, level=2/3, cop=NORMcop, para=fnorm.cop)
  lines(qnorm(UorV),         qnorm(MEDreg$V),    col="blue",    lty=2)
  lines(qnorm(UorV),         qnorm(MEDreg$Vlwr), col="blue",    lty=2)
  lines(qnorm(UorV),         qnorm(MEDreg$Vupr), col="blue",    lty=2)
  lines(qnorm(MEDreg2$U),    qnorm(UorV),        col="magenta", lty=2)
  lines(qnorm(MEDreg2$Ulwr), qnorm(UorV),        col="magenta", lty=2)
  lines(qnorm(MEDreg2$Uupr), qnorm(UorV),        col="magenta", lty=2)

A curious aside (Joe, 2014, p. 164) about the Gaussian copula is that Blomqvist Beta (blomCOP) is equal to Kendall Tau (tauCOP), which can be checked by

  blomCOP(cop=NORMcop, para=norm.cop) # 0.4936334
  tauCOP( cop=NORMcop, para=norm.cop) # 0.493633

and evidently \beta_\mathbf{C} and \tau_\mathbf{C} are numerically the same.
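
Both values match the closed form (2/\pi)\,\mathrm{arcsin}(\Theta) already given above for \tau_\mathbf{C}, which for the Gaussian copula is shared by Blomqvist Beta:

  (2/pi)*asin(0.7) # 0.4936334, the shared closed form for Beta and Tau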

Author(s)

W.H. Asquith

References

Joe, H., 2014, Dependence modeling with copulas: Boca Raton, Florida, CRC Press, 462 p.

Nelsen, R.B., 2006, An introduction to copulas: New York, Springer, 269 p.

Salvadori, G., De Michele, C., Kottegoda, N.T., and Rosso, R., 2007, Extremes in Nature—An approach using copulas: Springer, 289 p.

See Also

med.regressCOP2, qua.regressCOP, qua.regressCOP.draw

Examples

## Not run: 
# INDEPENDENCE YIELDS STRAIGHT LINES, RED IS THE MEDIAN REGRESSION
FF <- seq(0.1, 0.9, by=0.1)
plot(c(0,1),c(0,1), type="n", lwd=3,
     xlab="U, NONEXCEEDANCE PROBABILITY", ylab="V, NONEXCEEDANCE PROBABILITY")
# Draw the regression of V on U and then U on V (wrtV=TRUE)
qua.regressCOP.draw(f=FF, cop=P, para=NA, ploton=FALSE)
qua.regressCOP.draw(f=FF, cop=P, para=NA, ploton=FALSE, wrtV=TRUE, lty=2)#
## End(Not run)

## Not run: 
# NEGATIVE PLACKETT THETA YIELDS CURVES DOWN TO RIGHT, RED IS THE MEDIAN REGRESSION
theta <- 0.5; FF <- seq(0.1, 0.9, by=0.1)
plot(c(0,1),c(0,1), type="n", lwd=3,
     xlab="U, NONEXCEEDANCE PROBABILITY", ylab="V, NONEXCEEDANCE PROBABILITY")
# Draw the regression of V on U and then U on V (wrtV=TRUE)
qua.regressCOP.draw(f=FF, cop=PLACKETTcop, ploton=FALSE, para=theta)
qua.regressCOP.draw(f=FF, cop=PLACKETTcop, ploton=FALSE, para=theta, wrtV=TRUE, lty=2)#
## End(Not run)
