#' Generate 'n' colors
#'
#' @description Generates colors according to a range of HCL values I like.
#'
#' @param n number of colors to generate
#'
#' @return a character vector of hex color codes
#' @export
make_colors <- function(n) {
hues = seq(4, 320, length = n + 1)
colorspace::darken(hcl(h = hues, l = 60, c = 120)[1:n], 0.10, space = "HCL")
}
#' Average Absolute Deviation
#'
#' @description The average absolute deviation (AAD) from the median is an alternative
#' to the median absolute deviation (MAD). The AAD is the scale parameter for
#' a double exponential (Laplacian) distribution in just the same way that the standard
#' deviation is the scale parameter for the Gaussian distribution. An advantage is that
#' the AAD does not break down when 50\% of the values are identical, unlike the MAD.
#' Like the MAD, the AAD must be scaled by a constant to serve as an estimator of the
#' standard deviation. Here, the product of the raw AAD and the reciprocal of sqrt(2/pi),
#' or 1.253314, yields a consistent estimator of the standard deviation.
#'
#' @param x a numeric vector
#' @param constant the constant scaling factor. defaults to 1.253314.
#' @param center a function to use for estimating the center, from which the average absolute
#' deviations are measured. a number may also be provided. the default is to use the
#' Harrell-Davis estimate of the median.
#' @return a number
#' @export
#'
aad <- function(x, constant = 1.253314, center = hdmedian){
  x <- na.omit(x)
  if (is.function(center)){
    center <- center(x)
  }
  else if (is.numeric(center)){
    brightred <- crayon::make_style(rgb(0.961, 0.259, 0.40))
    if (length(center) > 1) return(cat(brightred("Please provide a single number.")))
  }
  adev <- mean(abs(x - center))
  return(adev * constant)
}
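## Illustrative sketch (wrapped in `if (FALSE)` so it never runs at package load):
## for Gaussian data E|X - mu| = sigma * sqrt(2/pi), so multiplying the raw average
## absolute deviation by 1/sqrt(2/pi) ~= 1.253314 recovers sigma.
if (FALSE) {
  set.seed(1)
  x <- rnorm(1e5, mean = 0, sd = 2)
  mean(abs(x - median(x)))        # raw AAD, roughly 2 * sqrt(2/pi) ~ 1.60
  aad(x, center = median)         # scaled AAD, roughly 2
  sd(x)                           # classical estimate, for comparison
}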
#' Robust tau-estimate of location and scale
#'
#' @description This implements Yohai & Zamar's tau-estimate of location and scale. The tau-estimate is
#' the result of a three stage estimation process. First, initial estimates of location and scale are made.
#' In this implementation the initial location and scale estimates \eqn{\mu0} and \eqn{\sigma0} are the
#' Harrell-Davis estimates of the median and median absolute deviation. The second stage uses a weight
#' function corresponding to a bounded \eqn{\psi} function, tuned with tuning constant \emph{c1} for
#' robustness in order to provide a robust estimate of location, \eqn{\mu}. The third stage uses a \eqn{\rho}
#' function with tuning constant \emph{c2} tuned for high efficiency to estimate the dispersion of the data
#' around \eqn{\mu}. The final scale estimate is adjusted with a consistency factor to ensure nominal efficiency
#' at the Gaussian distribution. The default tuning constants are taken from Maronna & Zamar (2002), which yield
#' 80\% efficiency at both the Gaussian and Cauchy distributions, and provides a good balance between robustness and
#' efficiency. The underlying implementation is a fast C++ algorithm. See details for more information.
#'
#' @details Weights are then defined using the bisquare weight function as \eqn{w = (1 - (u / c1)^2)^2 \times I(|u| <= c1)},
#' where \eqn{u = (x_i - \mu0)/ \sigma0}. The tau-estimate of location is then defined as \eqn{\mu = (\Sigma (x_i \times w_i)) / \Sigma w_i}
#' and the scale estimate as \eqn{\sigma = \sqrt{\sigma0^2 /n \times \Sigma \rho_c2((x_i - \mu)/ \sigma0)}} , where \eqn{\rho_c2}
#' is the bisquare \eqn{\rho} function with tuning constant c2.
#'
#'
#' @param x a vector
#' @param c1 a tuning parameter. defaults to 4.5.
#' @param c2 a second tuning parameter. defaults to 3.
#' @param mu whether or not the robust mean should also be returned. defaults to TRUE.
#'
#' @return a numeric vector or numeric value
#' @export
#'
#' @examples
#' tauLocScale(iris[,1])
#' @references
#' Yohai, V.J. and Zamar, R.H. (1988) "High breakdown-point estimates of regression by means of the minimization of an efficient scale." Journal of the American Statistical Association, 83:406–413. \cr \cr
#' Maronna, R.A. and Zamar, R.H. (2002) Robust estimates of location and dispersion of high-dimensional datasets. Technometrics 44(4), 307-317.
#'
tauLocScale <- function(x, c1 = 4.5, c2 = 3, mu = TRUE){
  scale <- cvreg:::tau_scale(x, c1, c2)
  if (mu){
    location <- cvreg:::tau_loc(x, c1, c2)
    return(c("location" = location, "scale" = scale))
  }
  else{
    return(c("scale" = scale))
  }
}
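## Plain-R sketch of the stages described in the details above. As a labelled
## simplification it uses median() and mad() for the initial estimates instead of
## the Harrell-Davis versions used internally, and omits the final consistency
## adjustment, so its output will differ slightly from tauLocScale().
if (FALSE) {
  x <- iris[, 1]; c1 <- 4.5; c2 <- 3
  mu0 <- median(x); s0 <- mad(x)              # initial location and scale
  u <- (x - mu0) / s0
  w <- (1 - (u / c1)^2)^2 * (abs(u) <= c1)    # bisquare weights
  mu <- sum(w * x) / sum(w)                   # tau-estimate of location
  rho <- function(u, c) pmin(1, 1 - (1 - (u / c)^2)^3)       # bisquare rho on [0, 1]
  s <- sqrt(s0^2 / length(x) * sum(rho((x - mu) / s0, c2)))  # unadjusted tau-scale
  c(location = mu, scale = s)
}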
#' Robust percentage bend estimate of location and scale
#'
#' @description This implements Shoemaker and Hettmansperger's percentage bend estimate of location
#' and scale. The underlying implementation is a fast C++ algorithm.
#'
#' @param x a vector
#' @param bend a tuning parameter. defaults to 0.15. maximum allowed value is 0.50.
#' @param mu whether or not the robust mean should also be returned. defaults to TRUE.
#'
#' @return a numeric vector or numeric value
#' @export
#'
#' @examples
#' pbLocScale(iris[,1])
#' @references
#' Shoemaker, L. H., & Hettmansperger, T. P. (1982). Robust estimates and tests for the one- and two-sample scale models. Biometrika, 69:47–54.
pbLocScale <- function(x, bend = 0.15, mu = TRUE){
  scale <- cvreg:::pb_scale(x, bend)
  if (mu){
    location <- cvreg:::pb_loc(x, bend)
    return(c("location" = location, "scale" = scale))
  }
  else{
    return(c("scale" = scale))
  }
}
#' Column Medians
#'
#' @description calculates column medians, analogous to the colMeans() function.
#' @param x a data frame or matrix
#' @param na.omit should rows with missing observations be removed? defaults to TRUE.
#'
#' @return a vector of column medians
#' @export
#'
#' @examples
#' colMedians(iris[,1:4])
#'
colMedians = function(x, na.omit = TRUE){
if (na.omit) x <- na.omit(x)
if (is.data.frame(x)) x <- as.matrix(x)
as.vector(cvreg:::colMediansCpp(x))
}
#' Column Median Absolute Deviations
#'
#' @description calculates column median absolute deviations, analogous to the colVars() function in the timeseries package.
#' @param x a data frame or matrix
#' @param na.omit should rows with missing observations be removed? defaults to TRUE.
#'
#' @return a vector of column median absolute deviations
#' @export
#'
#' @examples
#' colMads(iris[,1:4])
#'
colMads = function(x, na.omit = TRUE){
if (na.omit) x <- na.omit(x)
if (is.data.frame(x)) x <- as.matrix(x)
apply(x,2,mad)
}
#' Rebuild a covariance matrix from a correlation matrix and variances
#'
#' @param Rho a correlation matrix
#' @param var a vector of variances
#'
#' @return a covariance matrix
#' @export
#'
cor2cov <- function(Rho, var = NULL){
if (is.null(var)) stop("supply variances!")
if (ncol(Rho) != nrow(Rho)) stop("'Rho' is not a correlation matrix!")
if (length(var) != ncol(Rho)) stop("length of 'var' and dimension of 'Rho' are not equal!")
if (any(!is.finite(var))) stop("'var' contains missing or non-finite entries!")
d <- sqrt(var)
Sigma <- outer(d, d) * Rho
return(Sigma)
}
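## Round-trip sketch: converting a covariance matrix to a correlation matrix with
## cov2cor() and back with cor2cov() using the original variances recovers it.
if (FALSE) {
  S <- cov(iris[, 1:4])
  all.equal(cor2cov(cov2cor(S), diag(S)), S)
}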
#' Convert a covariance or correlation matrix to partial correlation matrix
#'
#' @param x a correlation matrix or a covariance matrix. either yields the same results.
#' @param tol tolerance for near-zero eigenvalues in the matrix inversion. values lower than this are considered zero.
#'
#' @return a partial correlation matrix
#' @export
#'
cov2pcor <- function (x, tol=1e-6) {
if (!is.null(colnames(x))){names <- colnames(x)}else{names<-NULL}
x = -pseudoinverse(x, tol = tol)
diag(x) = -diag(x)
if (!is.null(names)){colnames(x) <- rownames(x) <- names}
return(cov2cor(x))
}
#' Raise a symmetric matrix to the pth power
#'
#' @param X a matrix
#' @param p a power. defaults to -1 for an inverse.
#'
#' @return a matrix
#' @export
#'
#' @examples
#' # gives the square root of the covariance matrix
#' matpower(cov(iris[,1:4]), 0.50)
#'
matpower = function(X, p = -1){
if(isFALSE(ncol(X)==nrow(X))){return(cat(crayon::red("The matrix must be square!!")))}
X <- as.matrix(X)
cn <- colnames(X); rn <- rownames(X)
X <- cvreg:::matpowercpp(signif((X + t(X))/2, 8), p)
colnames(X) <- cn; rownames(X) <- rn;
return(X)
}
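## Sanity-check sketch for the fractional power and the inverse. matpower()
## symmetrizes and rounds its input to 8 significant digits, so the comparisons
## use a loose tolerance.
if (FALSE) {
  S <- cov(iris[, 1:4])
  Shalf <- matpower(S, 0.5)
  all.equal(Shalf %*% Shalf, S, tolerance = 1e-4, check.attributes = FALSE)
  all.equal(matpower(S, -1), solve(S), tolerance = 1e-4, check.attributes = FALSE)
}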
#' Return the closest value within a vector to a given input
#'
#' @param x the input to be matched
#' @param v the vector of reference numbers
#'
#' @return a number
#' @export
#'
#' @examples
#' v = c(1, 2, 4, 6)
#' match.closest(3.53424, v)
#'
match.closest <- function(x, v){
v[which.min(abs(x-v))]
}
#' Calculate Mahalanobis distances (tolerant to non positive definite matrices)
#'
#' @description The difference between this and the base mahalanobis() function is that this
#' version can tolerate a covariance matrix that is not positive definite, because the
#' inversion is done with the Moore-Penrose pseudoinverse.
#'
#' @param x a matrix of numeric data
#' @param center defaults to column means, but a function or a vector of medians or other location measures can be supplied.
#' @param cov either a covariance matrix or a function to calculate one. defaults to a function returning crossprod(x)/nrow(x).
#' @param inverted defaults to FALSE. change to TRUE if you are providing a precision matrix.
#' @param tol tolerance
#'
#' @return a vector
#' @export
#'
#' @examples
#' mahalanobis_dist(iris[,1:4])
mahalanobis_dist = function (x, center = colMeans, cov = function(x) crossprod(x)/nrow(x), inverted = FALSE, tol = 1e-12) {
x <- as.matrix(x)
if (is.function(cov))cov <- cov(as.matrix(x))
if (is.function(center)) center <- center(x)
if (!isFALSE(center)) x <- sweep(x, 2L, center)
if (!inverted)
return(setNames(rowSums(x %*% pseudoinverse(cov, tol)* x), rownames(x)))
if (inverted)
return(setNames(rowSums(x %*% cov * x), rownames(x)))
}
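## Sketch: with a positive definite covariance and mean centering the result agrees
## with stats::mahalanobis(); both return squared distances.
if (FALSE) {
  X <- as.matrix(iris[, 1:4])
  d1 <- mahalanobis_dist(X, center = colMeans, cov = cov)
  d2 <- mahalanobis(X, colMeans(X), cov(X))
  all.equal(unname(d1), unname(d2))
}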
#' Minimum Norm Regression with Moore-Penrose Inverse
#'
#' @description A fast solver for simple least squares models. Returns coefficients only.
#' Solves with Moore-Penrose pseudoinversion which is very fast and tolerant
#' to non-positive definite model matrices.
#'
#' @param formula formula
#' @param data data frame
#' @param w optional weights
#' @param tolerance tolerance
#'
#' @return
#' a vector
#' @export
#'
#' @examples
#' lmSolve(Sepal.Length ~ ., iris)
#'
lmSolve = function(formula, data, w = NULL, tolerance = 1e-9){
X = as.matrix(model.matrix(formula, data))
Y = model.frame(formula, data)[,1]
if (is.null(w)){
w = rep(1, length(Y))
}
betas = as.vector(tcrossprod(pseudoinverse(crossprod(X, (diag(w) %*% X)), tol = tolerance), X) %*% diag(w) %*% Y)
names(betas) = colnames(X)
return(betas)
}
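## Sketch: for a full-rank design the coefficients match lm(); with a rank-deficient
## design the pseudoinverse instead returns the minimum-norm solution rather than
## dropping a column.
if (FALSE) {
  b1 <- lmSolve(Sepal.Length ~ Sepal.Width + Petal.Length, iris)
  b2 <- coef(lm(Sepal.Length ~ Sepal.Width + Petal.Length, iris))
  all.equal(b1, b2)
}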
#' pseudoinverse
#'
#' @description Solve for the inverse of a matrix
#'
#' @param X a matrix
#' @param tol tolerance for small singular values. the default is 1e-9
#' @param method whether the divide and conquer ("dc") or standard ("std")
#' algorithm should be used for the SVD.
#'
#' @return
#' a matrix
#' @export
#'
#' @examples
#' pseudoinverse(cov(iris[,1:4]))
#'
pseudoinverse = function (X, tol = 1e-9, method = c("dc", "std")){
method <- match.arg(method)
X = as.matrix(X)
msvd <- try(wsvd(X, tol=tol, method=method),silent = T)
if (inherits(msvd, "try-error")){
msvd <- try(svd(X), silent=T)
}
if (is.complex(X)){msvd$u <- Re(msvd$u)}
Positive <- msvd$d > max(tol * msvd$d[1L], 0)
if (all(Positive)) msvd$v %*% (1/msvd$d * t(msvd$u))
else if (!any(Positive)) {return(array(0, dim(X)[2L:1L]))}
else {return(msvd$v[, Positive, drop = FALSE] %*% ((1/msvd$d[Positive]) * t(msvd$u[, Positive, drop = FALSE])))}
}
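## Sketch: the Moore-Penrose conditions X A X = X and A X A = A hold even when X is
## rank deficient, where solve(crossprod(X)) would fail.
if (FALSE) {
  X <- cbind(1, 1:5, (1:5) * 2)   # rank 2
  A <- pseudoinverse(X)
  all.equal(X %*% A %*% X, X)
  all.equal(A %*% X %*% A, A)
}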
#' Obtain the inverse crossproduct of a matrix
#'
#' @description Returns the pseudoinverse of the crossproduct. Use instead of solve(crossprod(X))
#'
#' @param X the matrix
#' @param W a vector or diagonal matrix of weights
#' @param tol tolerance
#' @param method whether the divide and conquer ("dc") or standard ("std")
#' algorithm should be used for the SVD.
#' @return
#' a matrix
#' @export
#'
#' @examples
#' XtXinv(as.matrix(iris[,1:4]))
#'
XtXinv = function (X, W = NULL, tol = 1e-9, method = c("dc", "std")){
method <- match.arg(method)
if (is.null(W)){
W = diag(rep(1, nrow(X)))
}
if (is.vector(W)){
W = diag(W)
}
X = as.matrix(X)
return(pseudoinverse(crossprod(X, W) %*% X, tol = tol, method = method))
}
#' Gram-Schmidt Orthonormal Matrix Decomposition
#'
#' @param X a matrix
#' @param tol tolerance for singularity
#'
#' @return a list of two matrices, Q and R
#' @export
#'
gram.schmidt <- function (X, tol = 1e-100) {
X <- as.matrix(X)
lpnorm <- function (x, lp = 2) {
stopifnot(is.numeric(x) || is.complex(x), is.numeric(lp))
normfun <- function(x, lp){
if (lp > -Inf && lp < Inf)
sum(abs(x)^lp)^(1/lp)
else if (lp == Inf)
max(abs(x))
else if (lp == -Inf)
min(abs(x))
else return(NULL)
}
if (length(lp) > 1){
sapply(lp, function(p) normfun(x, p))
} else {
normfun(x, lp)
}
}
stopifnot(is.numeric(X), is.matrix(X))
m <- nrow(X)
n <- ncol(X)
if (m < n)
stop("Number of rows of 'X' must be greater or equal to the number of columns.")
Q <- matrix(0, m, n)
R <- matrix(0, n, n)
for (k in 1:n) {
Q[, k] <- X[, k]
if (k > 1) {
for (i in 1:(k - 1)) {
R[i, k] <- crossprod(Q[, i], Q[, k])
Q[, k] <- Q[, k] - R[i, k] * Q[, i]
}
}
R[k, k] <- lpnorm(Q[, k])
if (abs(R[k, k]) <= tol)
stop("Matrix 'X' does not have full rank.")
Q[, k] <- Q[, k]/R[k, k]
}
return(list(Q = Q, R = R))
}
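## Sketch: Q should have orthonormal columns and Q %*% R should reproduce X.
if (FALSE) {
  X <- as.matrix(iris[, 1:4])
  gs <- gram.schmidt(X)
  all.equal(crossprod(gs$Q), diag(ncol(X)), check.attributes = FALSE)
  all.equal(gs$Q %*% gs$R, X, check.attributes = FALSE)
}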
#' QB decomposition
#'
#' @description QB decomposition is an approximation algorithm for the
#' QR decomposition. Let Q be an approximation to the 'true' orthonormal Q,
#' and let B be a matrix of basis vectors, B = Q'X. The original
#' input matrix X can then be approximated using Q and B as X = QB. This is
#' based on the rqb function in the rsvd package, but with a few changes
#' to increase speed and to set default arguments adaptively when they are
#' not specified by the user.
#'
#' @param X the matrix to be decomposed
#' @param k target rank. if left as NULL the rank is set as the number
#' of columns in X.
#' @param p oversampling factor. higher values improve accuracy, but
#' come at the expense of increased computing time when dealing with
#' very large matrices. if left as NULL a value will be chosen based
#' on the size of the matrix.
#' @param q number of power iterations. higher values reflect a faster
#' decrease in the singular values of the matrix (ie, a few large values
#' and suddenly much smaller ones). lower values reflect more smoothly
#' decaying singular values. if left as NULL a value will be chosen based
#' on the dimensions of the matrix.
#' @param rand determines whether a randomized algorithm should be used.
#' defaults to TRUE. setting to FALSE skips the random projection and power
#' iterations and instead takes a single deterministic QR decomposition of X.
#' @return a list containing named matrices Q and B
#' @export
#' @references
#' Halko, N., Martinsson, P., & Tropp, J. (2011) Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM J. Review., 53(2), 217–288, doi:10.1137/090771806 \cr \cr
#' Martinsson, P. & Voronin, S. (2015) Randomized Blocked Algorithm for Efficiently Computing Rank-revealing Factorizations of Matrices. SIAM J. Sci. Comput., 38(5), 485–507, doi:10.1137/15M1026080
#'
qb <- function(X,k=NULL,p=NULL,q=NULL,rand=TRUE){
X <- as.matrix(X)
QB <- list()
rows <- nrow(X); cols <- ncol(X)
if(is.null(p)){if(prod(dim(X))>1e5){p=15}else{p=25}};
if(is.null(q)){if(rows<cols){q=6}else{q=3}}
if(is.null(k)) k = cols
if(k > cols) k <- cols
if(is.character(k)) stop("Target rank is not valid! Please enter a number!")
if(k < 1) k <- 1
#Set oversampling parameter
l <- round(k) + round(p)
if(l > cols) l <- cols; if(l < 1) stop("Target rank is not valid!")
isreal <- ifelse(is.complex(X), F, T)
if(rand == TRUE) {
O <- matrix(cvreg::rpowexp(l*cols,0,0.5,5),cols,l)
if(!isreal){O <- O + matrix(1i * cvreg::rpowexp(l*cols,0,1,1),cols,l)}
Y <- X %*% O; rm(O)
#Orthogonalize using QR decomposition
if( q > 0 ) {
for( i in 1:q) {
Y <- qr.Q(qr(Y,complete=F));Z<-ifelse(is.complex(X),list(crossprod(Conj(X),Y)),list(crossprod(X,Y)))[[1]]
Z <- qr.Q(qr(Z,complete=F));Y<-X%*%Z
}
rm(Z)
}
QB$Q <- qr.Q(qr(Y,complete=F)); is.real<-ifelse(is.complex(QB$Q),F,T)
} else{
QB$Q <- qr.Q(qr(X,complete=F)); is.real<-ifelse(is.complex(QB$Q),F,T)
}
if(!isreal){QB$B<-crossprod(Conj(QB$Q),X)}else{QB$B<-crossprod(QB$Q,X)}
return(QB)
}
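## Sketch: the quality of the factorization is judged by how well Q %*% B reconstructs
## X, as described above; at the default (full) target rank the reconstruction error
## is near zero and Q has orthonormal columns.
if (FALSE) {
  X <- as.matrix(scale(iris[, 1:4]))
  fit <- qb(X)
  all.equal(crossprod(fit$Q), diag(ncol(fit$Q)), check.attributes = FALSE)
  norm(X - fit$Q %*% fit$B, "F")   # near zero at full target rank
}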
#' Calculate the orthogonal complement of a positive definite matrix
#'
#' @param x a matrix
#'
#' @return a matrix
#' @export
#'
#' @examples
#' orthCompl(as.matrix(iris[,1:4]))
#'
orthCompl <- function(x) {
x <- as.matrix(x)
if (length(dimnames(x)[[2]])==0) dimnames(x)[[2]] <- paste("Col",1:ncol(x), sep=".")
x <- cbind("Intercept"=1, x)
qr.x <- qr(x)
Q <- qr.Q(qr.x, complete=TRUE)
dimnames(Q) <- list(paste("R",1:nrow(Q), sep="."),
paste("C",1:ncol(Q), sep="."))
if (length(dimnames(x)[[1]])) dimnames(Q)[[1]] <- dimnames(x)[[1]]
mincolx <- min(ncol(x), ncol(Q))
if (length(dimnames(x)[[2]])) dimnames(Q)[[2]][1:mincolx] <- dimnames(x)[[2]][1:mincolx]
sweep(Q, 2, apply(abs(Q[1:nrow(x),,drop=FALSE]), 2, sum)/2, "/")
}
#' Measure the distance between two orthonormal projections
#'
#' @param s1 subspace 1
#' @param s2 subspace 2
#'
#' @return a number
#' @export
#'
#' @examples
#' fit1 = SIR(prog ~ . -sex, diabetes)
#' fit2 = OPG(prog ~ . -sex, diabetes)
#' subspace.dist(fit1$basis, fit2$basis)
#'
subspace.dist = function(s1,s2){
s1 <- as.matrix(s1)
s2 <- as.matrix(s2)
if (dim(s1)[2] == 1){
p1 = s1%*%t(s1)/c(t(s1)%*%s1)
}else{
p1 <- s1%*%cvreg::matpower(t(s1)%*%s1,-1)%*%t(s1)
}
if (dim(s2)[2] == 1){
p2 = s2%*%t(s2)/c(t(s2)%*%s2)
}else{
p2 <- s2%*%cvreg::matpower(t(s2)%*%s2,-1)%*%t(s2)
}
d = sqrt(sum((p1-p2)*(p1-p2)))
return(d)
}
#' Measure the angle between two projections
#'
#' @param s1 subspace 1
#' @param s2 subspace 2
#'
#' @return a number
#' @export
#'
subspace.angle <- function (s1,s2) {
if (!is.numeric(s1) || !is.numeric(s2))
stop("Arguments 's1' and 's2' must be numeric matrices.")
if (is.vector(s1))
s1 <- matrix(c(s1), nrow = length(s1), ncol = 1)
if (is.vector(s2))
s2 <- matrix(c(s2), nrow = length(s2), ncol = 1)
if (nrow(s1) != nrow(s2))
stop("Matrices 's1' and 's2' must have the same number of rows.")
s1 <- gram.schmidt(s1)$Q
s2 <- gram.schmidt(s2)$Q
if (ncol(s1) < ncol(s2)) {
tmp <- s1
s1 <- s2
s2 <- tmp
}
for (k in 1:ncol(s1)) {
s2 <- s2 - s1[, k] %*% t(s1[, k]) %*% s2
}
asin(min(1, svd(s2)$d))
}
#' Generalized T Kernel (kernlab compatible)
#'
#' @description This is an additional kernel function that can be used with kernlab functions. The generalized t kernel is implemented here as \deqn{K(x, y) = \frac{1}{1 + ||x-y||_k / \sigma}},
#' where \eqn{||.||_k} is the Lk norm.
#'
#' @param k the value for the norm. defaults to 3.
#' @param sigma the scale parameter
#'
#' @return a kernel class function
#' @export
#'
#' @examples
#' gtdot(4, 1)
#'
gtdot <- function (k=3,sigma=1) {
rval <- function(x, y = NULL) {
if (!is(x, "vector"))
stop("x must be a vector")
if (!is(y, "vector") && !is.null(y))
stop("y must a vector")
if (is(x, "vector") && is.null(y)) {
return(1)
}
if (is(x, "vector") && is(y, "vector")) {
if (!length(x) == length(y))
stop("number of dimension must be the same on both data points")
norm_vec <- function(x, k) sum(abs(x)^k)^(1/k)
return(1 / ( 1 + (norm_vec(x-y, k)/sigma)))
}
}
kern <- new("kernel", .Data = rval, kpar = list(k = k, sigma = sigma))
attr(kern, "package") <- "kernlab"
return(kern)
}
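## Usage sketch (assumes the kernlab package is available): the returned object is a
## kernlab "kernel", so it can be passed to kernelMatrix() and to kernlab model
## fitting functions.
if (FALSE) {
  K <- kernlab::kernelMatrix(gtdot(k = 3, sigma = 1), as.matrix(iris[, 1:4]))
  dim(K)   # 150 x 150
}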
#' Spherical Kernel (kernlab compatible)
#'
#' @description This is a kernel that projects each pairwise difference onto a sphere.
#' It is similar to the process used in \code{\link{pcaSphere}} but is applied
#' to a kernel instead of the data.
#'
#'
#' @param theta a scale parameter
#' @return a kernel class function
#' @export
#'
#' @examples
#' spheredot(1)
#'
spheredot <- function(theta=1) {
rval <- function(x,y=NULL) {
if (!is(x, "vector"))
stop("x must be a vector")
if (!is(y, "vector") && !is.null(y))
stop("y must a vector")
if (is(x, "vector") && is.null(y)) {
return(1)
}
if (is(x, "vector") && is(y, "vector")) {
if (!length(x) == length(y))
stop("number of dimension must be the same on both data points")
norm_vec <- function(x) sqrt(sum(abs(x)^2))
norm_xy <- norm_vec(x-y)
if (norm_xy < theta){
out = 1-(1.5*norm_xy/theta)+(0.5*((norm_xy/theta)^3));
} else {
out = 0;
}
return(out)
}
}
kern <- new("kernel", .Data = rval, kpar = list(theta=theta))
attr(kern, "package") <- "kernlab"
return(kern)
}
#' Student's T (kernlab compatible)
#'
#' @description This is an additional kernel function that can be used with kernlab functions.
#' The Student's T kernel is implemented here as \deqn{K(x, y) = \frac{1}{1 + ||x-y||_\nu}},
#' where \eqn{||.||_\nu} is the L-\eqn{\nu} norm.
#'
#' @param nu the value for the norm
#'
#' @return a kernel class function
#' @export
#'
#' @examples
#' tdot(1)
#'
tdot <- function (nu) {
rval <- function(x, y = NULL) {
if (!is(x, "vector"))
stop("x must be a vector")
if (!is(y, "vector") && !is.null(y))
stop("y must a vector")
if (is(x, "vector") && is.null(y)) {
return(1)
}
if (is(x, "vector") && is(y, "vector")) {
if (!length(x) == length(y))
stop("number of dimension must be the same on both data points")
norm_vec <- function(x, k) sum(abs(x)^k)^(1/k)
return(1 / ( 1 + (norm_vec(x-y, nu))))
}
}
kern <- new("kernel", .Data = rval, kpar = list(nu=nu))
attr(kern, "package") <- "kernlab"
return(kern)
}
#' ARD Matern Kernel (Kappa fixed at 3/2) (kernlab compatible)
#'
#' This is the automatic relevance detection variant of the materndot kernel
#' with the complexity parameter \eqn{\kappa} fixed at 3/2. It takes one user specified
#' parameter, sigma, which is set to 1 by default.
#'
#' @description This is an additional kernel function that can be
#' used with kernlab functions. The Matern kernel is defined as follows:
#'
#' \deqn{
#' K(x, y) = \sigma^2 (1 + \sqrt{3} ||x-y||) \exp(-\sqrt{3} ||x-y||)
#' }
#'
#'
#'
#' @param sigma a scale parameter
#' @return a kernel class function
#' @export
#'
#' @examples
#' materndot(1)
#'
materndot <- function(sigma=1) {
rval <- function(x, y = NULL) {
if (!is(x, "vector"))
stop("x must be a vector")
if (!is(y, "vector") && !is.null(y))
stop("y must a vector")
if (is(x, "vector") && is.null(y)) {
return(1)
}
if (is(x, "vector") && is(y, "vector")) {
if (!length(x) == length(y))
stop("number of dimension must be the same on both data points")
p <- 1.5 - 0.5
d <- sum((x - y)^2)^0.5
sigma^2 * (1 + (sqrt(3)*d)) * exp((-sqrt(3)*d))
}
}
kern <- new("kernel", .Data = rval, kpar = list(sigma=sigma))
attr(kern, "package") <- "kernlab"
return(kern)
}
#' Sigmoid Kernel (kernlab compatible)
#'
#' @description This is simply an alias for the tanh kernel in kernlab. It is commonly
#' known as the sigmoid kernel, and this alias provides a means of accessing it under
#' that more familiar name. It is commonly used to represent categorical features.
#'
#' @param scale a scale parameter
#' @param offset an offset parameter
#' @return a kernel class function
#' @export
#'
#' @examples
#' sigdot(1,1)
#'
sigdot <- function(scale=1, offset=1) {
tanhdot(scale, offset)
}
PearsonKernel <- function(x) {
if (!is.factor(x)){
return(cat(crayon::red("This kernel is for categorical vectors. Please supply a factor.")))
}
x <- factor(x);
z <- unlist(list(x))
z <- as.numeric(z)
x <- z[seq_along(x)];
prop <- table(x) / length(x)
unqx <- sort(unique(x))
tmpx <- lapply(unqx, function(k) which(x == k))
tmp <- lapply(seq_along(unqx), function(k) expand.grid(tmpx[[k]], tmpx[[k]]))
res <- matrix(-1, nrow = length(x), ncol = length(x))
for (i in seq_along(unqx)) {
res[as.matrix(tmp[[i]])] <- 1 / prop[unqx[i]] - 1
}
return(res)
}
#' Vector Lp norm
#'
#' @param x a numeric vector
#' @param lp the desired norm. defaults to 2.
#'
#' @return a numeric value
#' @export
#'
vecnorm <- function (x, lp = 2) {
stopifnot(is.numeric(x) || is.complex(x), is.numeric(lp))
if (lp > -Inf && lp < Inf)
sum(abs(x)^lp)^(1/lp)
else if (lp == Inf)
max(abs(x))
else if (lp == -Inf)
min(abs(x))
}
#' @name hdquantile
#' @title Returns the Harrell-Davis Quantile Estimates
#' @param x a vector
#' @param probs the quantiles desired
#' @description This returns the Harrell-Davis estimates of the requested quantiles for a vector.
#' @returns a named numeric vector
#' @export
#'
hdquantile <- function(x, probs = seq(0,1,len=5)){
out<-as.vector(cvreg:::hdquantileCPP(x, probs))
names(out)<-paste0(probs*100, "%")
return(out)
}
#' Expectiles
#'
#' @description Expectiles, introduced by Newey and Powell (1987),
#' are analogs of the quantiles which minimize an asymmetric quadratic
#' objective function rather than the asymmetric linear objective function
#' which defines the quantiles.
#'
#' @param x a vector
#' @param probs the expectiles desired
#' @param hd if TRUE (the default) the Harrell-Davis method is used to refine the expectiles by passing the expectile
#' estimates to the empirical cumulative distribution function for the data to obtain the equivalent
#' quantile probabilities, then utilizing the \code{\link{hdquantile}} function. See the example for
#' more information.
#' @return a named numeric vector
#' @export
#' @examples
#' ## To understand how the Harrell-Davis method is used to refine the expectiles, first get the raw expectiles.
#' ex<-expectile(iris$Sepal.Length,hd=F)
#' ## Plug these estimates into the ecdf to obtain the quantile probabilities.
#' q<-ecdf(iris$Sepal.Length)(ex)
#' ## Pass the quantile probabilities to the hdquantile function. These quantiles correspond to the expectiles.
#' hdquantile(iris$Sepal.Length, q)
#' ## However, you can just use expectile(iris$Sepal.Length), as the hd adjustment is enabled by default.
#' @references
#' Newey, W., & Powell, J. (1987). Asymmetric Least Squares Estimation and Testing.
#'
expectile <- function (x, probs = seq(0, 1, len = 5), hd=T) {
if (!is.vector(x))
stop("observations are needed in vector form.")
if (min(probs) < 0 || max(probs) > 1)
stop("only asymmetries between 0 and 1 allowed.")
e = mean(x)
ee = 0 * probs
g = max(abs(x)) * 1e-08
for (k in 1:length(probs)) {
p = probs[k]
if (p == 0)
ee[k] = min(x, na.rm = TRUE)
else if (p == 1)
ee[k] = max(x, na.rm = TRUE)
else {
for (it in 1:500) {
w = ifelse(x < e, 1 - p, p)
enew = sum(w * x)/sum(w)
de = max(abs(enew - e))
e = enew
if (de < g)
break
}
ee[k] = e
}
}
if (hd){ee <- hdquantile(x, ecdf(x)(ee))}
ee <- zapsmall(ee, 10)
names(ee) <- paste0(probs*100, "%")
return(ee)
}
#' Improved One Dimensional Root (Zero) Finding
#'
#' @description a "safe" version of uniroot() searches for a root (that is, zero) of the function f with respect to its first argument.
#' "safe" means searching for the correct interval = c(lower,upper) if sign(f(x)) does not satisfy the requirements at the interval end points
#' This function was adapted from the copula package.
#'
#' @param f the function for which the root is sought.
#' @param interval a vector containing the end-points of the interval to be searched for the root.
#' @param ... additional named or unnamed arguments to be passed to f
#' @param lower the lower end point of the interval to be searched
#' @param upper the upper end point of the interval to be searched
#' @param f.lower the same as f(lower)
#' @param f.upper the same as f(upper)
#' @param Sig desired sign of f(upper), or NULL
#' @param check.conv logical indicating whether a convergence warning of the underlying
#' uniroot should be caught as an error and if non-convergence in maxiter iterations should be an error instead of a warning.
#' @param tol the desired accuracy (convergence tolerance). defaults to 1e-3.
#' @param maxiter the maximum number of iterations. defaults to 2500.
#' @param trace integer number; if positive, tracing information is produced. Higher values giving more details.
#' @author Martin Maechler
#' @return the list returned by \code{\link{uniroot}} (components root, f.root, iter, and estim.prec)
#' @export
#'
saferoot <- function(f, interval, ...,
lower = min(interval), upper = max(interval),
f.lower = f(lower, ...), f.upper = f(upper, ...),
Sig = NULL, check.conv = FALSE,
tol = 1e-3, maxiter = 2500, trace = 0)
{
if ( is.null(Sig) && f.lower * f.upper > 0 ||
is.numeric(Sig) && (Sig*f.lower > 0 || Sig*f.upper < 0)) {
if(trace)
cat(sprintf("search in [%g,%g]%s", lower, upper,
if(trace >= 2)"\n" else " ... "))
Delta <- function(u) 0.01* pmax(1e-7, abs(u))
## Two cases:
if(is.null(Sig)) {
## case 1) 'Sig' unspecified --> extend (lower, upper) at the same time
delta <- Delta(c(lower,upper))
while(isTRUE(f.lower*f.upper > 0) && any(iF <- is.finite(c(lower,upper)))) {
if(iF[1]) f.lower <- f(lower <- lower - delta[1])
if(iF[2]) f.upper <- f(upper <- upper + delta[2])
if(trace >= 2)
cat(sprintf(" .. modified lower,upper: (%15g,%15g)\n",
lower,upper))
delta <- 2 * delta
}
} else {
## case 2) 'Sig' specified --> typically change only *one* of lower, upper
## make sure we have Sig*f(lower) < 0 and Sig*f(upper) > 0:
delta <- Delta(lower)
while(isTRUE(Sig*f.lower > 0)) {
f.lower <- f(lower <- lower - delta)
if(trace) cat(sprintf(" .. modified lower: %g\n", lower))
delta <- 2 * delta
}
delta <- Delta(upper)
while(isTRUE(Sig*f.upper < 0)) {
f.upper <- f(upper <- upper + delta)
if(trace) cat(sprintf(" .. modified upper: %g\n", upper))
delta <- 2 * delta
}
}
if(trace && trace < 2)
cat(sprintf("extended to [%g, %g]\n", lower, upper))
}
if(!isTRUE(f.lower * f.upper <= 0))
stop("did not succeed extending the interval endpoints for f(lower) * f(upper) <= 0")
if(check.conv) {
r <- tryCatch(uniroot(f, ..., lower=lower, upper=upper,
f.lower=f.lower, f.upper=f.upper,
tol=tol, maxiter=maxiter),
warning = function(w)w)
if(inherits(r, "warning"))
stop("convergence problem in zero finding: ", r$message)
else r
}
else
uniroot(f, ..., lower=lower, upper=upper,
f.lower=f.lower, f.upper=f.upper,
tol=tol, maxiter=maxiter)
}
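## Sketch of the "safe" behaviour: f has the same sign at both endpoints of the
## supplied interval, so uniroot() alone would fail, while saferoot() first widens
## the interval until a sign change is bracketed.
if (FALSE) {
  f <- function(x) x^2 - 4
  ## uniroot(f, c(3, 5))    # errors: values at end points not of opposite sign
  saferoot(f, c(3, 5))$root # approximately 2 after the interval is extended
}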
#' Variable Screening With Diversity Induced Self-Representation
#'
#' @description DISR is a method of screening variables which selects
#' variables which balance being similar to other features in a set of
#' variables and being unique relative to other features. The balance
#' is controlled by the user via two parameters, 'self' and 'div'.
#'
#' @param x a matrix or data frame of numeric covariates.
#' @param nvar number of variables to retain
#' @param self a positive number which determines how heavily self-representation
#' is used to select variables. higher values relative to 'div' result in selection of features
#' that are more similar.
#' @param div a positive number which determines how heavily non-similarity to
#' other features is used to select variables. higher values relative to 'self' result in selection
#' of features that are more distinctive.
#' @param gamma the parameter that controls the learning rate. defaults to 1.2.
#' @param max_iter maximum number of iterations.
#' @param tol convergence tolerance
#'
#' @return a list of the loading matrix of selection indicators and a reduced 'x'.
#' @export
#'
#' @references
#' Liu, Y., Liu, K., Zhang, C., Wang, J., & Wang, X. (2017). Unsupervised feature selection via Diversity-induced Self-representation. Neurocomputing, 219, 350–363. doi:10.1016/j.neucom.2016.09.043
disr <- function(x, nvar = 3, self = 1, div = 1, gamma = 1.2, max_iter = 1000, tol = 1e-9){
x <- as.matrix(x)
idx <- cvreg:::disrCPP(x, self, div, gamma, max_iter, tol)
idx <- order(idx, decreasing = TRUE)[1:nvar]
newx <- x[,idx]
disrSelector <- function(p,nvar,idx){
mat = array(0,c(p,nvar))
for (i in 1:nvar){j = as.integer(idx[i]); mat[j,i] = 1}
return(mat)
}
loadings <- disrSelector(ncol(x), nvar, idx)
rownames(loadings) <- colnames(x)
colnames(loadings) <- paste0("DIM", 1:nvar)
list(loadings = loadings, newx = newx)
}
#' Weighted Singular Value Decomposition
#'
#' @description The singular value decomposition of a matrix \strong{X} solves the equation
#' \strong{X = UDV'}, where \strong{U} is the matrix of \emph{left singular vectors},
#' \strong{D} is a diagonal matrix of \emph{singular values} (\eqn{\delta}), and \strong{V} is a
#' matrix of \emph{right singular vectors}.
#'
#' Note that \strong{U} is equivalent to the eigenvectors of the matrix \strong{XX'} and
#' \strong{V} is equivalent to the eigenvectors of the matrix \strong{X'X}. This implementation
#' of singular value decomposition has three main features: \cr
#'
#' \enumerate{
#' \item{The decomposition of \strong{X} is sped up by first computing the SVD of either \strong{XX'} (p>n) or
#' \strong{X'X} (p<n), rather than that of \strong{X}. For the case of p>n, this is \strong{XX' = U D^2 U'},
#' and for the case of p<n, this is \strong{X'X = V D^2 V'}. This respectively gives the left or right
#' singular vectors. For p>n, the right singular vectors are then obtained via \strong{V = X'UD^{-1}},
#' and for p<n the left singular vectors are obtained via \strong{U = XVD^{-1}}. The reason this approach
#' improves speed of calculation for large data sets is that the SVD is only performed on a min(n,p) by min(n,p)
#' matrix, which can be substantially faster than performing the SVD on an n by p data set.}
#'
#' \item{The observations going into the SVD are allowed to be weighted. Currently weights on the variables
#' are not implemented. The weighted left singular vectors are defined as \strong{\eqn{\Omega} = W^{-0.5}U}. The resulting
#' singular values \strong{\eqn{\Delta}} are then the singular values of the weighted data matrix \strong{\eqn{\Phi}},
#' where \strong{\eqn{\Phi} = \eqn{\Omega \Delta} V'}.}
#'
#' \item{Only the singular values and singular vectors corresponding to \eqn{\delta} > tol are retained, where tol is
#' a tolerance for small values. Values less than or equal to tol are considered to be zero.}
#' }
#'
#' @param x a matrix
#' @param weights row weights. defaults to equal weights and returns the typical SVD.
#' @param tol tolerance for small singular values. values <= tol are considered zero and truncated.
#' @param method whether the divide and conquer ("dc") or standard ("std")
#' algorithm should be used for the SVD.
#' @return a list
#' @export
#'
#' @examples
#' wsvd(iris[,-5])
#' @references Abdi, H. (2007) Singular Value Decomposition (SVD) and Generalized Singular Value Decomposition (GSVD). Encyclopedia of Measurement and Statistics.
wsvd <- function(x, weights = NULL, method = c("dc","std"), tol = 1e-4){
method <- match.arg(method)
x <- as.matrix(x)
if (is.null(weights)) weights <- rep(1, nrow(x))
fastSVD.wide <- function (x, tol = NULL, method){
if (is.null(tol))
tol = prod(dim(x)) * .Machine$double.eps
s = cvreg:::fastSVDwide(x, method = method, eps = tol)
s$d <- as.vector(s$d)
return(s)
}
fastSVD.long <- function (x, tol = NULL, method){
if (is.null(tol))
tol = prod(dim(x)) * .Machine$double.eps
s = cvreg:::fastSVDlong(x, method = method, eps = tol)
s$d <- as.vector(s$d)
return(s)
}
fastSVD.trunc <- function (x, tol = NULL, method){
if (is.null(tol))
tol = prod(dim(x)) * .Machine$double.eps
s = cvreg:::svdCPP(x, method = method, eps = tol)
s$d <- as.vector(s$d)
return(s)
}
fastSVD = function (x, tol = NULL, method){
n = nrow(x); p = ncol(x)
if (n > 2 * p) {
return(fastSVD.long(x, tol, method))
}
else if (2 * n < p) {
return(fastSVD.wide(x, tol, method))
}
else {
return(fastSVD.trunc(x, tol, method))
}
}
if (all(weights==1)){
return(fastSVD(x, tol, method))
}
else{
if (is.vector(weights)) weights <- diag(weights)
r <- list(weights=weights)
x <- matpower(weights, 1/2) %*% x
result <- fastSVD(x, tol, method)
r$d <- result$d
r$v <- result$v
r$u <- matpower(weights, -1/2) %*% result$u
r$weights <- diag(weights)
if (all(r$v[, 1] < 0)) {
r$u <- r$u * (-1)
r$v <- r$v * (-1)
}
return(r[c("d", "u", "v", "weights")])
}
}
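## Plain base-R sketch of the speed trick described above for the n >> p case:
## eigendecompose the small p x p matrix X'X to obtain V and the singular values,
## then recover the left singular vectors as U = X V D^{-1}.
if (FALSE) {
  X <- as.matrix(iris[, 1:4])
  e <- eigen(crossprod(X), symmetric = TRUE)
  d <- sqrt(e$values)                  # singular values of X
  V <- e$vectors
  U <- X %*% V %*% diag(1 / d)
  all.equal(d, svd(X)$d)
  all.equal(unname(abs(U[, 1])), abs(svd(X)$u[, 1]))   # columns agree up to sign
}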
#' L1-Norm Singular Value Decomposition
#'
#' @description In the classical singular value decomposition (SVD), the singular values
#' and the singular vectors are computed based on an L2 norm. However, the L2 norm is easily
#' misled by even a single sufficiently large outlier due to the values being squared. Another
#' option is to use an L1 norm, which works with absolute values. Unfortunately, the L1 norm
#' is not always as straightforward or fast to compute. This function implements the AL1 algorithm
#' of Hawkins (2001). The algorithm is capable of reaching the L1-norm solution to SVD, but
#' does so through an alternating L1-regression algorithm which is not as efficient as classical
#' SVD algorithms. For very large data sets, it will likely be too slow and inefficient for
#' practical use. The AL1 algorithm also does not sort singular vectors by the decreasing
#' order of the singular values, however, this is easily taken care of and this function
#' returns the sorted vectors and values. L1-normed singular vectors are also not necessarily
#' orthogonal, however, the implementation here allows for orthogonalizing the results.
#'
#' @references Hawkins, D.M., Liu, L., & Young, S.S. (2001) Robust Singular Value Decomposition, National Institute of Statistical Sciences, Technical Report Number
#'
#' @param x a data frame or matrix of numeric covariates
#' @param maxit the maximum number of iterations for the approximation algorithm. defaults to 500.
#' @param tol convergence tolerance. defaults to 1e-4
#' @param interp should values be interpolated during the calculation of L1 weighted svd? Since
#' the calculations entail calculating weighted medians, and weighted medians are not always unique,
#' this will smooth the values as to obtain a unique solution. this can come at the cost of accuracy,
#' however, as well as substantially slow down both the rate of convergence
#' and rate of calculations at each iteration, so the default is FALSE. However,
#' if convergence problems occur with this option set to FALSE, it may be worth trying
#' interp=TRUE.
#' @keywords internal
#' @return a list containing singular values and left and right singular vectors, along with
#' convergence information and number of iterations for each singular vector.
#' @export
#'
svdL1 <- function(x, maxit = 500, tol = 0.00001, interp=F) {
n <- nrow(x);p <- ncol(x)
X <- x <- as.matrix(x)
L1RegCoef <- function(x,a){
keep <- (a!=0) & (!is.na(x))
a <- a[keep]
return(matrixStats::weightedMedian(x[keep]/a,abs(a),
na.rm = TRUE,
interpolate=interp))
}
L1SingularVal <- function(x,a,b){
x <- as.vector(x)
ab <- as.vector(outer(a,b))
keep <- (ab!=0) & (!is.na(x))
ab <- ab[keep]
return(matrixStats::weightedMedian(x[keep]/ab,abs(ab),
na.rm=TRUE,
interpolate=interp))
}
## Initialize outputs
ret <- list(u = matrix(NA, nrow=nrow(x), ncol=ncol(x)),
v = matrix(NA,nrow=ncol(x),ncol=ncol(x)),
d = rep(NA,ncol(x)),
converged = rep(FALSE, ncol(x)),
iter = rep(0, ncol(x))
)
for(j in 1:ncol(x)) {
vj.ini <- apply(abs(x), 2, function(i) median(i))
#vj.ini <- L1median(abs(x))$center
vj <- vj.ini
converged <- FALSE
iter <- 0
while(!any(converged, iter == maxit)){
vjprev <- vj
d <- apply(x,1,L1RegCoef,vj)
uj <- d/sqrt(sum(d^2))
e <- apply(x,2,L1RegCoef,uj)
vj <- e/sqrt(sum(e^2))
if(sum(abs(vj-vjprev)) < tol){
ret$converged[j]<-converged<-TRUE
}
else{
ret$converged[j]<-converged<-FALSE
}
ret$iter[j] <- iter <- iter + 1
}
eig <- L1SingularVal(x,uj,vj)
x <- x - eig * tcrossprod(uj,vj)
ret$u[,j] <- uj
ret$v[,j] <- vj
ret$d[j] <- eig
}
ord<-order(ret$d, decreasing = T)
ret$u <- zapsmall(ret$u[,ord],14)
ret$v <- zapsmall(ret$v[,ord],14)
ret$d <- ret$d[ord]
ret$converged <- ret$converged[ord]
if (all(ret$v[, 1] < 0)) {
ret$u <- ret$u * (-1)
ret$v <- ret$v * (-1)
}
return(ret[c("d", "u", "v", "converged", "iter")])
}
#' Weighted quantile
#' @noRd
weighted.quantile <- function(x, w, probs=seq(0,1,0.25), na.rm=TRUE) {
x <- as.numeric(as.vector(x))
w <- as.numeric(as.vector(w))
if(anyNA(x) || anyNA(w)) {
ok <- !(is.na(x) | is.na(w))
x <- x[ok]
w <- w[ok]
}
stopifnot(all(w >= 0))
if(all(w == 0)) stop("All weights are zero", call.=FALSE)
oo <- order(x);x <- x[oo];w <- w[oo]
Fx <- cumsum(w)/sum(w)
result <- numeric(length(probs))
for(i in seq_along(result)) {
p <- probs[i]
lefties <- which(Fx <= p)
if(length(lefties) == 0) {
result[i] <- x[1]
} else {
left <- max(lefties)
result[i] <- x[left]
if(Fx[left] < p && left < length(x)) {
right <- left+1
y <- x[left] + (x[right]-x[left]) * (p-Fx[left])/(Fx[right]-Fx[left])
if(is.finite(y)) result[i] <- y
}
}
}
names(result) <- paste0(format(100 * probs, trim = TRUE), "%")
return(result)
}
#' Check if a matrix is positive definite
#'
#' @description checks if the eigenvalues are positive real numbers
#' @param m a symmetric matrix
#' @param tol the tolerance value determines what eigenvalues are considered
#' essentially zero. the default is 1e-6.
#' @return a logical value
#' @export
#' @examples
#' pdcheck(cov(iris[,1:4]))
#'
pdcheck <- function(m, tol = 1e-6){
if(!isSymmetric(m)){return(FALSE)}
eval = eigen(m,T,T)$values
if(!is.complex(eval) && all(eval > tol)){return(TRUE)}else{return(FALSE)}
}
#' Symmetrize a matrix with asymmetric numerical errors above/below diagonal
#'
#' @description Sometimes a matrix that ought to be symmetric will contain slight
#' numerical inaccuracies above or below the diagonal that result in a deviation
#' from symmetry. This function will fix that.
#' @param m a matrix
#' @return a matrix
#' @export
#'
symm <- function(m){(m+t(m))/2}
#' Matrix dotproduct
#'
#' @description Given vectors or matrices 'a' and 'b' of the same size, returns the dot product; for matrices, the dot product of each pair of corresponding columns is returned.
#' @param a first matrix
#' @param b second matrix
#'
#' @return a vector
#' @export
#'
dotprod <- function (a, b) {
if (length(a) == 0 && length(b) == 0)
return(0)
if (!(is.numeric(a) || is.complex(a)) || !(is.numeric(b) || is.complex(b)))
stop("Arguments 'a' and 'b' must be real or complex.")
a <- drop(a)
b <- drop(b)
if (any(dim(a) != dim(b)))
stop("Matrices 'a' and 'b' must be of same size")
if (is.vector(a) && is.vector(b)) {
dim(a) <- c(length(a), 1)
dim(b) <- c(length(b), 1)
}
dot <- colSums(Conj(a) * b)
return(dot)
}
#' Coerce a non-positive definite symmetric matrix to positive definite
#'
#' @description This is a wrapper around the Matrix package's nearPD
#' function. If the diagonal of the original matrix is all 1s, this is
#' detected and assumed to indicate the input is a correlation matrix,
#' so the resulting output is converted to a correlation matrix via
#' cov2cor. If the diagonal is not all 1s, this is not applied.
#'
#' @param m a symmetric matrix
#' @param tol the tolerance value determines what eigenvalues are considered
#' essentially zero. the default is 1e-6.
#'
#' @return a matrix
#' @export
#'
mpd <- function(m, tol = 1e-6) {
  if (!pdcheck(m, tol)){
    rho <- all(diag(m) == 1)
    m0 <- m
    m <- try(as.matrix(Matrix::nearPD(m0,eig.tol=tol,posd.tol=tol,do2eigen=T,doDykstra=T,conv.norm.type="2")$mat),silent=T)
    if (inherits(m, "try-error")){
      m <- try(as.matrix(Matrix::nearPD(m0,eig.tol=tol,posd.tol=tol,do2eigen=T,doDykstra=F,conv.norm.type="2")$mat),silent=T)
      if (inherits(m, "try-error")){
        m <- try(as.matrix(Matrix::nearPD(m0,eig.tol=tol,do2eigen=T,doDykstra=F,conv.norm.type="2")$mat),silent=T)
        if (inherits(m, "try-error")){
          m <- as.matrix(Matrix::nearPD(m0,eig.tol=tol,do2eigen=F,doDykstra=F,conv.norm.type="2")$mat)
        }
      }
    }
    m <- symm(m)
    if(rho){return(cov2cor(m))}else{return(m)}
  }
  else{
    return(symm(m))
  }
}
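## Sketch: a "correlation" matrix whose pairwise values cannot coexist is not
## positive definite; mpd() repairs it while keeping the unit diagonal.
if (FALSE) {
  R <- matrix(c(   1, 0.9, -0.9,
                 0.9,   1,  0.9,
                -0.9, 0.9,    1), 3, 3)
  pdcheck(R)                          # FALSE
  R2 <- mpd(R)
  min(eigen(R2, TRUE, TRUE)$values)   # no longer negative
  diag(R2)                            # unit diagonal preserved
}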
#' Bootstrap Confidence Intervals (Bias-Corrected & Accelerated)
#'
#' @description This function implements the procedure described by
#' DiCiccio and Efron (1996) to calculate the bias-corrected and
#' accelerated confidence intervals using only the bootstrap or
#' monte carlo samples. This function is fast, and gives more accurate
#' intervals than simply taking the quantiles of the bootstrap samples.
#'
#' @param samps a vector of the bootstrap or monte carlo samples for a single parameter.
#' @param conf.level the confidence level. defaults to 0.95
#'
#' @return a vector of two numeric values containing the interval endpoints
#' @export
#'
bci <- function(samps, conf.level = .95){
low <- (1 - conf.level)/2
high <- 1 - low
sims <- length(samps)
z.inv <- length(samps[samps < mean(samps)])/(sims+1)
z <- qnorm(z.inv)
U <- (sims) * (mean(samps, na.rm=TRUE) - samps)
top <- sum(U^3)
under <- 6 * (sum(U^2))^{3/2}
a <- top / under
lower.inv <- pnorm(z + (z + qnorm(low))/(1 - a * (z + qnorm(low))))
lower <- hdquantile(samps, lower.inv)
upper.inv <- pnorm(z + (z + qnorm(high))/(1 - a * (z + qnorm(high))))
upper <- hdquantile(samps, upper.inv)
return(c(lower, upper))
}
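## Usage sketch: bootstrap a sample mean and compare the bias-corrected interval with
## the naive percentile interval.
if (FALSE) {
  set.seed(42)
  x <- rexp(60)                                  # skewed data
  boot.means <- replicate(4000, mean(sample(x, replace = TRUE)))
  bci(boot.means, conf.level = 0.95)             # bias-corrected & accelerated
  quantile(boot.means, c(0.025, 0.975))          # naive percentile interval
}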
#' Calculate p-values from bootstrap or MCMC samples
#' @noRd
#' @export
exact_pvalue <- function(samples) {
if (length(dim(samples)) == 0) {
std <- backsolve(chol(var(samples)), cbind(0, t(samples)) -
mean(samples), transpose = TRUE)
sqdist <- colSums(std * std)
(sum(sqdist[-1] > sqdist[1]) + 1)/(length(samples)+1)
}
else {
std <- backsolve(chol(var(samples)), cbind(0, t(samples)) -
colMeans(samples), transpose = TRUE)
sqdist <- colSums(std * std)
(sum(sqdist[-1] > sqdist[1]) + 1)/(nrow(samples) + 1)
}
}
#' Ridge-Tikhonov Regularized Matrix Power for Symmetric Matrices
#' @noRd
#' @export
ridgepow <- function(m, pow){
ridgepower <- function(m,lambda,p){
onorm<-eigen(symm(m),T,T)$values[1]
return(matpower(m+lambda*onorm*diag(nrow(m)),p))
}
lambda = seq(2e-3,0.25,len=24);pd=FALSE;iter=0
while (isFALSE(pd) && iter < 25){
iter <- iter + 1
mp <- suppressWarnings(suppressMessages(try(ridgepower(m,lambda[iter],pow))))
if (inherits(mp, "try-error")){pd <- FALSE}else{pd <- pdcheck(mp)}
}
return(mp)
}
#' Get standard errors from a hessian matrix
#' @noRd
#' @export
hessian2se <- function(a, vcov=FALSE) {
namesp <- colnames(a)
hess <- a
hess <- ifelse(hess==-Inf, -1E9, hess)
hess <- ifelse(hess==+Inf, 1E9, hess)
sigma <- try(solve(hess), silent=TRUE)
if (inherits(sigma, "try-error")){
hess <- mpd(hess); VCOV <- sigma <- try(pseudoinverse(hess), silent=TRUE)
}
if (!inherits(sigma, "try-error")) {
if (all(diag(sigma)>=0)) {
VCOV <- mpd(sigma)
res <- sqrt(diag(VCOV))
} else {
s. <- svd(sigma)
R <- t(s.$v %*% (t(s.$u) * sqrt(pmax(s.$d, 0))))
res <- structure((matrix(rep(1, nrow(R)), nrow = 1, byrow = TRUE) %*% R)[1, ], .Names=colnames(hess))
if (any(res<0)) {
if (requireNamespace("Matrix", quietly = TRUE)) {
d <- diag(mpd(sigma))
names(d) <- colnames(hess)
res <- ifelse(d<0, NA, sqrt(d))
} else {
res <- NA
}
if (any(is.na(res))) {
a <- sigma
n = dim(a)[1];
root = matrix(0,n,n);
for (i in 1:n){
sum = 0;
if (i>1){sum = sum(root[i,1:(i-1)]^2)};
x = a[i,i] - sum;
if (x<0){x = 0;}
root[i,i] = sqrt(x);
if (i < n){
for (j in (i+1):n){
if (root[i,i] == 0){
x=0;
}
else{
sum = 0;
if (i>1) {
sum = root[i,1:(i-1)] %*% t(t(root[j,1:(i-1)]))
}
x = (a[i,j] - sum)/root[i,i];
}
root[j,i] = x;
}
}
}
colnames(root) <- rownames(root) <- colnames(hess)
VCOV <- pseudoV <- tcrossprod(root)
d <- diag(pseudoV)
if (any(d != 0) & all(d >= 0)) {
res <- sqrt(diag(pseudoV))
} else {
newMat <- a
cholError <- TRUE
iter <- 0
while (cholError) {
iter <- iter + 1
diag(newMat) <- diag(newMat)+1e-24
newMat <- newMat/sqrt(diag(newMat) %*% t(diag(newMat)))
cholStatus <- try(u <- chol(newMat), silent = TRUE)
cholError <- inherits(cholStatus, "try-error")
}
root <- cholStatus
colnames(root) <- rownames(root) <- colnames(hess)
VCOV <- pseudoV <-tcrossprod(root)
res <- sqrt(diag(pseudoV))
}
}
}
}
}
SEInf <- namesp[!namesp %in% names(res)]
res <- c(res, structure(rep(+Inf, length(SEInf)), .Names=SEInf))
if (vcov) {return(list(SE=res, vcov=VCOV))} else {return(res)}
}
#' Format very large or small numbers as approximate fractions
#'
#' @param num a vector of numbers
#' @param digits the number of digits to use.
#'
#' @return a character vector
#' @export
#'
formatNumber <- function (num, digits = "default") {
round.num <- function(num, digits = "default") {
if (digits == "default") {
if (num < 1/1000)
result <- "1/1000"
if ((num >= 1/1000) & (num <= 1/10))
result <- paste("1/", as.character(round(1/num)), sep = "")
if ((num > 1/10) & (num < 1))
result <- paste("1/", as.character(round(1/num, digits = 1)), sep = "")
if ((num < 10) & (num >= 1))
result <- as.character(round(num, digits = 1))
if ((num >= 10) & (num <= 9999))
result <- as.character(round(num))
if (num > 9999)
result <- "> 9999"
}
else {
if (num < 1)
result <- paste("1/", as.character(round(1/num, digits = digits)), sep = "")
else result <- as.character(round(num, digits = digits))
}
return(result)
}
result <- character()
for (i in 1:length(num)) result[i] <- round.num(num[i], digits = digits)
return(result)
}
#' Center a Kernel Matrix
#'
#' @param kern a kernel matrix. can be a matrix class object or a kernelMatrix class
#' object (from the kernlab package).
#'
#' @return a centered kernel matrix.
#' @export
#'
#' @examples
#' k <- kernelMatrix(cauchydot(1), as.matrix(iris[,1:4]))
#' kc <- centerKernel(k)
centerKernel <- function(kern){
R <- apply(kern, 1, sum)
C <- apply(kern, 2, sum)
S <- sum(kern)
N <- ncol(kern)
kern - R / N - rep(C, each = nrow(kern)) / N + S / (N ^ 2)
}
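## Sketch: for a linear kernel K = XX', centering the kernel is the same as
## column-centering X before forming the kernel (the double-centering identity
## K - row means - column means + grand mean).
if (FALSE) {
  X <- as.matrix(iris[, 1:4])
  K <- tcrossprod(X)
  Kc <- centerKernel(K)
  all.equal(Kc, tcrossprod(scale(X, center = TRUE, scale = FALSE)),
            check.attributes = FALSE)
}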
#' Convert a kernel matrix to a distance matrix
#'
#' @param K a kernel matrix
#' @param center should the kernel be centered first? defaults to TRUE.
#' @param method one of "cs" or "norm"
#'
#' @return a matrix
#' @export
#'
kern2dist <- function(K, center = TRUE, method = c("cs", "norm")){
method <- match.arg(method)
num <- nrow(K)
index <- 1:num
comb.index <- combn(index,2)
D <- matrix(0, nrow = nrow(K), ncol = ncol(K))
if (center){K <- centerKernel(K)}
eig <- svd(K)
rank <- sum(eig$d)
eig$d[eig$d<=0] <- 1e-4
K <- eig$u%*%diag(eig$d)%*%t(eig$v)
K <- symm(K)
if(method=="norm"){
apply(comb.index,2,FUN=function(x){
valxx <- K[x[1],x[1]]
valxy <- K[x[1],x[2]]
valyy <- K[x[2],x[2]]
D[x[1],x[2]]<<-D[x[2],x[1]]<<- sqrt((valxx-2*valxy+valyy)^2)
})
diag(D) <- 0
}else if(method=="cs"){
apply(comb.index,2,FUN=function(x){
valxx <- K[x[1],x[1]]
valxy <- K[x[1],x[2]]
valyy <- K[x[2],x[2]]
D[x[1],x[2]]<<-D[x[2],x[1]]<<- acos(valxy^2/(valxx*valyy))
})
diag(D) <- 0
}
D[which(is.na(D))] <- max(D,na.rm=TRUE)
return(D)
}
#' Convert a distance matrix to a kernel matrix
#'
#' @param D a distance matrix
#' @param center should the kernel be centered before returning? defaults to TRUE.
#'
#' @return a matrix
#' @export
#'
dist2kern <- function(D, center = TRUE){
D <- as.matrix(D)
num <- nrow(D)
index <- 1:num
comb.index <- combn(index,2)
K <- matrix(0, nrow = nrow(D), ncol = ncol(D))
apply(comb.index,2,FUN=function(x){
K[x[1],x[2]]<<-K[x[2],x[1]]<<- (-D[x[1],x[2]]+sum(D[,x[2]])/num+sum(D[,x[1]])/num - sum(D)/(num^2))/2
})
if(center){K <- centerKernel(K)}
eig <- svd(K)
eig$d[eig$d<=0] <- 1e-6
K <- eig$u%*%tcrossprod(diag(eig$d), eig$v)
K <- try(mpd(symm(K)), silent = T)
if (class(K)=="try.error"){
eig$d <- eig$d + 1e-8; eig$d[eig$d<=1e-4] <- 1e-4
K <- eig$u%*%tcrossprod(diag(eig$d), eig$v)
}
return(K)
}
#' Minkowski–Bouligand Box Counting Dimension
#'
#' @description This estimates the minimum number of variables needed
#' to represent a data set, the fractal dimension. The returned value
#' is not always an integer, so take ceiling(estimate) if using to select
#' a number of variables.
#'
#' @param X the data frame or matrix of numeric covariates
#' @param sizes number of box sizes to test. defaults to the
#' number of columns in X.
#' @param plot should a visualization of \deqn{\dim(X) =
#' \lim_{r \to 0} \frac{\log N(r)}{\log (1/r)}} be shown? defaults to F.
#' @return a list containing the estimated fractal dimension,
#' the box sizes, and the radii of the boxes
#' @export
#'
dimbox <- function(X, sizes = ncol(X), plot = F){
cut=c(0.1,0.9)
X = as.matrix(X)
n = nrow(X)
d = ncol(X)
sizes = as.integer(sizes)
Is = col_minmax(X,0.1)
rstart = ((min(Is[2,]-Is[1,])) + (max(Is[2,]-Is[1,])))/2
maxlength = floor(1-log2(max(1e-15,1e-10/rstart)))
maxlength = min(sizes,maxlength)
vecr = as.vector(array(0,c(1,maxlength)))
vecN = as.vector(array(0,c(1,maxlength)))
Imin = as.vector(Is[1,])
tX = t(X)
for (i in 1:maxlength){
currentr = rstart*(0.75^(i-1))
counted = round(cvreg:::boxcount_util(tX,Imin,currentr))
vecr[i] = currentr
vecN[i] = sum(!duplicated(counted))
}
if (length(which(vecN==max(vecN)))>2){
minidx = min(which(vecN==max(vecN)))
vecr = vecr[1:(minidx+1)]
vecN = vecN[1:(minidx+1)]
}
conversion = .approxBoxcount(vecr, vecN, sizes)
vecr = conversion$r
vecN = conversion$Nr
if ((!is.vector(cut))||(length(cut)!=2)||(any(cut<=0))||(any(cut>=1))){
stop("'cut' should be a vector of length 2.")
}
idxtrim = .fractalTrimmer(vecN, cut)
vecrtrim = vecr[idxtrim]
vecNtrim = vecN[idxtrim]
nnn = length(vecrtrim)
estdim = sum(coef(MASS::rlm(log(vecNtrim)~log(1/vecrtrim)))[2])
output = list()
output$estimated.dimension = estdim
output$radii = vecr
output$dimensions = vecN
if (plot) plot(log(1/output$radii),log(output$dimensions), type = "l", lwd = 2.25, col = "red")
return(output)
}
RV <- function(a, b, adj = F){
n <- nrow(a)
if (nrow(b) != n || ncol(a) != ncol(b)){
return(cat(crayon::red("A and B must have the same dimensions")))
}
AA <- tcrossprod(a)
BB <- tcrossprod(b)
RV <- sum(diag(AA%*%BB)) / (sum(diag(AA%*%AA))*sum(diag((BB%*%BB))))^0.5
if(adj){
mrvB <- sqrt(sum(diag(AA))^2 / sum(diag(AA%*%AA))) * sqrt(sum(diag(BB))^2 / sum(diag(BB%*%BB))) / (n-1)
RV <- (RV - mrvB) / (1 - mrvB)
}
return(RV)
}
.approxBoxcount <- function(r, Nr, sizes){
x = log(1/r)
y = log(Nr)
interp = stats::approx(x,y,n=sizes,method="linear")
xnew = interp$x
ynew = interp$y
output = list()
output$r = exp(-xnew)
output$Nr = exp(ynew)
return(output)
}
.fractalTrimmer <- function(counts, cut){
cut = sort(cut)
counts = log(counts)
fmin = min(counts[is.finite(counts)])
if (fmin <= 0){data = counts + abs(fmin)}else{data = counts}
finite = which(is.finite(data))
maxvalue = max(data[is.finite(data)])
thr1 = maxvalue*min(cut)
thr2 = maxvalue*max(cut)
idx = intersect(intersect(which(data>=thr1), which(data<=thr2)), finite)
return(idx)
}