#' Cross-Validation for Sparse Density-Convoluted SVM
#'
#' Conducts k-fold cross-validation for \code{\link{dcsvm}} and returns suggested values of the L1 tuning parameter \code{lambda}.
#'
#' @name cv.dcsvm
#' @aliases cv.dcsvm
#' @title Cross-Validation for Sparse Density-Convoluted SVM
#'
#' @description
#' Performs cross-validation for the sparse density-convoluted SVM to estimate the optimal tuning parameter \code{lambda}.
#'
#' @usage
#' cv.dcsvm(x, y, lambda = NULL, hval = 1,
#'   pred.loss = c("misclass", "loss"), nfolds = 5, foldid, ...)
#'
#' @param x A matrix of predictors, i.e., the \code{x} matrix used in \code{\link{dcsvm}}.
#' @param y A vector of binary class labels, i.e., the \code{y} used in \code{\link{dcsvm}}.
#' @param lambda Default is \code{NULL}, in which case the sequence generated by \code{\link{dcsvm}} is used. Users can also supply their own \code{lambda} sequence for cross-validation.
#' @param hval The bandwidth parameter for kernel smoothing. Default is 1.
#' @param pred.loss \code{"misclass"} for classification error, \code{"loss"} for the density-convoluted SVM loss.
#' @param nfolds The number of folds. Default is 5. The allowable range is from 3 to the sample size. Larger \code{nfolds} increases computational time.
#' @param foldid An optional vector with values between 1 and \code{nfolds}, giving the fold index of each observation. If supplied, \code{nfolds} can be missing.
#' @param ... Other arguments that can be passed to \code{\link{dcsvm}}.
#'
#' @details
#' This function fits \code{\link{dcsvm}} with each fold excluded in turn, then computes the mean cross-validated error and its standard deviation over the folds. It is adapted from the \code{cv} functions of the \code{gcdnet} and \code{glmnet} packages.
#'
#' @return
#' A \code{\link{cv.dcsvm}} object is returned, which is a list with the components of the cross-validation fit:
#' \item{lambda}{The \code{lambda} sequence used in \code{\link{dcsvm}}.}
#' \item{cvm}{A vector of length \code{length(lambda)} for the mean cross-validated error.}
#' \item{cvsd}{A vector of length \code{length(lambda)} for estimates of standard error of \code{cvm}.}
#' \item{cvupper}{The upper curve: \code{cvm + cvsd}.}
#' \item{cvlower}{The lower curve: \code{cvm - cvsd}.}
#' \item{nzero}{Number of non-zero coefficients at each \code{lambda}.}
#' \item{name}{"Mis-classification error", for plotting purposes.}
#' \item{dcsvm.fit}{A fitted \code{\link{dcsvm}} object using the full data.}
#' \item{lambda.min}{The \code{lambda} incurring the minimum cross-validation error \code{cvm}.}
#' \item{lambda.1se}{The largest value of \code{lambda} such that error is within one standard error of the minimum.}
#' \item{cv.min}{The minimum cross-validation error.}
#' \item{cv.1se}{The cross-validation error associated with \code{lambda.1se}.}
#'
#' @seealso
#' \code{\link{dcsvm}}, \code{\link{plot.cv.dcsvm}}, \code{\link{predict.cv.dcsvm}}, and \code{\link{coef.cv.dcsvm}} methods.
#'
#' @examples
#' data(colon)
#' colon$x <- colon$x[ ,1:100] # Use only the first 100 columns for this example
#' n <- nrow(colon$x)
#' set.seed(1)
#' id <- sample(n, trunc(n / 3))
#' cvfit <- cv.dcsvm(colon$x[-id, ], colon$y[-id], lam2=1, nfolds=5)
#' plot(cvfit)
#' predict(cvfit, newx=colon$x[id, ], s="lambda.min")
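#'
#' # An additional illustrative sketch (not part of the original example):
#' # supplying fold assignments through 'foldid' so that repeated calls
#' # evaluate exactly the same folds; values must lie in 1..nfolds.
#' foldid <- sample(rep(1:5, length.out = n - length(id)))
#' cvfit.user <- cv.dcsvm(colon$x[-id, ], colon$y[-id], lam2 = 1,
#'                        foldid = foldid)
#' cvfit.user$lambda.min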
#'
#' @export
cv.dcsvm <- function(x, y, lambda=NULL, hval=1,
                     pred.loss=c("misclass", "loss"), nfolds=5, foldid, ...) {
  ####################################################################
  ## data setup
  if (missing(pred.loss))
    pred.loss <- "misclass" else pred.loss <- match.arg(pred.loss)
  if (!match(pred.loss, c("misclass", "loss"), FALSE)) {
    warning("Only 'misclass' and 'loss' are available for the ",
            "density-convoluted SVM; 'misclass' used.")
    pred.loss <- "misclass"
  }
  y <- c(-1, 1)[as.factor(drop(y))]
  x <- as.matrix(x)
  if (!all(y %in% c(-1, 1)))
    stop("y should be a factor with two levels.")
  x.row <- as.integer(NROW(x))
  if (length(y) != x.row)
    stop("x and y have different number of observations.")
  ####################################################################
  ## fit on the full data to obtain the lambda sequence
  dcsvm.object <- dcsvm(x, y, lambda=lambda, hval=hval, ...)
  lambda <- dcsvm.object$lambda
  nz <- sapply(coef(dcsvm.object, type="nonzero"), length)
  if (missing(foldid))
    foldid <- sample(rep(seq(nfolds),
      length=x.row)) else nfolds <- max(foldid)
  if (nfolds < 3)
    stop("nfolds must be at least 3; nfolds=5 recommended.")
  outlist <- as.list(seq(nfolds))
  ## fit the model nfolds times, leaving one fold out each time
  for (i in seq(nfolds)) {
    which <- foldid == i
    outlist[[i]] <- dcsvm(x=x[!which, , drop=FALSE],
                          y=y[!which], lambda=lambda, hval=hval, ...)
  }
  ## compute the CV curve: dispatch to cv.<kernel> according to the
  ## smoothing kernel recorded in the second class of the dcsvm fit
  fun <- paste("cv", class(dcsvm.object)[[2]], sep=".")
  cvstuff <- do.call(fun,
                     list(outlist, lambda, x, y, foldid, pred.loss, hval))
  cvm <- cvstuff$cvm
  cvsd <- cvstuff$cvsd
  cvname <- cvstuff$name
  out <- list(lambda=lambda, cvm=cvm, cvsd=cvsd,
              cvupper=cvm + cvsd, cvlower=cvm - cvsd, nzero=nz,
              name=cvname, dcsvm.fit=dcsvm.object)
  obj <- c(out, as.list(getmin(lambda, cvm, cvsd)))
  class(obj) <- "cv.dcsvm"
  obj
}
cv.gaussian <- function(outlist, lambda, x, y, foldid, pred.loss, hval) {
  ### Turn y into c(-1,1)
  y <- as.factor(y)
  y <- c(-1, 1)[as.numeric(y)]
  nfolds <- max(foldid)
  predmat <- matrix(NA, length(y), length(lambda))
  nlams <- double(nfolds)
  ## collect the out-of-fold linear predictors at every lambda
  for (i in seq(nfolds)) {
    which <- foldid == i
    fitobj <- outlist[[i]]
    preds <- predict(fitobj, x[which, , drop=FALSE], type="link")
    nlami <- length(outlist[[i]]$lambda)
    predmat[which, seq(nlami)] <- preds
    nlams[i] <- nlami
  }
  typenames <- c(misclass = "mis-classification error",
                 loss = "SVM with Gaussian smoothing loss")
  cvraw <- switch(pred.loss,
                  loss = loss.gaussian(y * predmat, hval),
                  misclass = (y != ifelse(predmat > 0, 1, -1)))
  if (length(y) / nfolds >= 3) {
    cvob <- cvcompute(cvraw, foldid, nlams)
    cvraw <- cvob$cvraw
    cvn <- cvob$N
  } else cvn <- length(y) - colSums(is.na(predmat))
  cvm <- colMeans(cvraw, na.rm=TRUE)
  cvsd <- sqrt(colMeans(scale(cvraw, cvm, FALSE)^2,
                        na.rm=TRUE) / (cvn - 1))
  list(cvm = cvm, cvsd = cvsd, name = typenames[pred.loss])
}
cv.uniform <- function(outlist, lambda, x, y, foldid, pred.loss, hval) {
  ### Turn y into c(-1,1)
  y <- as.factor(y)
  y <- c(-1, 1)[as.numeric(y)]
  nfolds <- max(foldid)
  predmat <- matrix(NA, length(y), length(lambda))
  nlams <- double(nfolds)
  ## collect the out-of-fold linear predictors at every lambda
  for (i in seq(nfolds)) {
    which <- foldid == i
    fitobj <- outlist[[i]]
    preds <- predict(fitobj, x[which, , drop=FALSE], type="link")
    nlami <- length(outlist[[i]]$lambda)
    predmat[which, seq(nlami)] <- preds
    nlams[i] <- nlami
  }
  typenames <- c(misclass = "mis-classification error",
                 loss = "SVM with uniform smoothing loss")
  cvraw <- switch(pred.loss,
                  loss = loss.uniform(y * predmat, hval),
                  misclass = (y != ifelse(predmat > 0, 1, -1)))
  if (length(y) / nfolds >= 3) {
    cvob <- cvcompute(cvraw, foldid, nlams)
    cvraw <- cvob$cvraw
    cvn <- cvob$N
  } else cvn <- length(y) - colSums(is.na(predmat))
  cvm <- colMeans(cvraw, na.rm=TRUE)
  cvsd <- sqrt(colMeans(scale(cvraw, cvm, FALSE)^2,
                        na.rm=TRUE) / (cvn - 1))
  list(cvm = cvm, cvsd = cvsd, name = typenames[pred.loss])
}
cv.epanechnikov <- function(outlist, lambda, x, y, foldid, pred.loss, hval) {
  ### Turn y into c(-1,1)
  y <- as.factor(y)
  y <- c(-1, 1)[as.numeric(y)]
  nfolds <- max(foldid)
  predmat <- matrix(NA, length(y), length(lambda))
  nlams <- double(nfolds)
  ## collect the out-of-fold linear predictors at every lambda
  for (i in seq(nfolds)) {
    which <- foldid == i
    fitobj <- outlist[[i]]
    preds <- predict(fitobj, x[which, , drop=FALSE], type="link")
    nlami <- length(outlist[[i]]$lambda)
    predmat[which, seq(nlami)] <- preds
    nlams[i] <- nlami
  }
  typenames <- c(misclass = "mis-classification error",
                 loss = "SVM with Epanechnikov smoothing loss")
  cvraw <- switch(pred.loss,
                  loss = loss.epanechnikov(y * predmat, hval),
                  misclass = (y != ifelse(predmat > 0, 1, -1)))
  if (length(y) / nfolds >= 3) {
    cvob <- cvcompute(cvraw, foldid, nlams)
    cvraw <- cvob$cvraw
    cvn <- cvob$N
  } else cvn <- length(y) - colSums(is.na(predmat))
  cvm <- colMeans(cvraw, na.rm=TRUE)
  cvsd <- sqrt(colMeans(scale(cvraw, cvm, FALSE)^2,
                        na.rm=TRUE) / (cvn - 1))
  list(cvm = cvm, cvsd = cvsd, name = typenames[pred.loss])
}
#' Make Predictions from a "cv.dcsvm" Object
#'
#' This function predicts the class labels of new observations using the sparse density-convoluted SVM at the \code{lambda} values suggested by \code{\link{cv.dcsvm}}.
#'
#' @name predict.cv.dcsvm
#' @aliases predict.cv.dcsvm
#' @title Make Predictions from a "cv.dcsvm" Object
#'
#' @description
#' Predicts class labels for new data based on the cross-validated \code{lambda} values from a \code{cv.dcsvm} object.
#'
#' @usage
#' \method{predict}{cv.dcsvm}(object, newx, s = c("lambda.1se", "lambda.min"), ...)
#'
#' @param object A fitted \code{\link{cv.dcsvm}} object.
#' @param newx A matrix of new values for \code{x} at which predictions are to be made. Must be a matrix. See documentation for \code{predict.dcsvm}.
#' @param s Value(s) of the L1 tuning parameter \code{lambda} for making predictions. Default is \code{s = "lambda.1se"} saved in the \code{cv.dcsvm} object. An alternative choice is \code{s = "lambda.min"}. \code{s} can also be numeric, representing the specific value(s) to use.
#' @param ... Other arguments passed to \code{predict.dcsvm}.
#'
#' @details
#' This function uses the cross-validation results to make predictions. It is adapted from the \code{predict.cv} function in the \code{glmnet} and \code{gcdnet} packages.
#'
#' @return
#' Predicted class labels or fitted values at the selected value(s) of \code{lambda}, depending on any further arguments (such as \code{type}) passed on to \code{predict.dcsvm}.
#'
#' @seealso
#' \code{\link{cv.dcsvm}}, and \code{\link{coef.cv.dcsvm}} methods.
#'
#' @examples
#' data(colon)
#' colon$x <- colon$x[ , 1:100] # Use only the first 100 columns for this example
#' set.seed(1)
#' cv <- cv.dcsvm(colon$x, colon$y, lam2=1, nfolds=5)
#' predict(cv$dcsvm.fit, newx=colon$x[2:5, ],
#' s=cv$lambda.1se, type="class")
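#'
#' # An additional illustrative sketch (not part of the original example):
#' # predicting directly from the 'cv.dcsvm' object, first at the stored
#' # 'lambda.min' and then at a numeric value taken from the fitted
#' # lambda sequence (numeric 's' is also accepted).
#' predict(cv, newx = colon$x[2:5, ], s = "lambda.min")
#' predict(cv, newx = colon$x[2:5, ], s = cv$lambda[10])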
#'
#' @export
predict.cv.dcsvm <- function(object, newx,
                             s=c("lambda.1se", "lambda.min"), ...) {
  if (is.numeric(s))
    lambda <- s else if (is.character(s)) {
    s <- match.arg(s)
    lambda <- object[[s]]
  } else stop("Invalid form for s.")
  predict(object$dcsvm.fit, newx, s=lambda, ...)
}
#' Plot the Cross-Validation Curve of Sparse Density-Convoluted SVM
#'
#' Plots the cross-validation curve against a function of \code{lambda} values, including upper and lower standard deviation curves.
#'
#' @name plot.cv.dcsvm
#' @aliases plot.cv.dcsvm
#' @title Plot the Cross-Validation Curve of Sparse Density-Convoluted SVM
#'
#' @description
#' Depicts the cross-validation curves for the sparse density-convoluted SVM.
#'
#' @usage
#' \method{plot}{cv.dcsvm}(x, sign.lambda = 1, ...)
#'
#' @param x A fitted \code{\link{cv.dcsvm}} object.
#' @param sign.lambda Whether to plot against \code{log(lambda)} (default, \code{sign.lambda = 1}) or against \code{-log(lambda)} (\code{sign.lambda = -1}).
#' @param ... Other graphical parameters to \code{plot}.
#'
#' @details
#' This function visualizes the cross-validation curves for a \code{cv.dcsvm} object, which plots the relationship between \code{lambda} values and cross-validation error.
#'
#' @return
#' No return value; the function is called for its side effect of producing a plot.
#'
#' @seealso
#' \code{\link{cv.dcsvm}}.
#'
#' @examples
#' data(colon)
#' colon$x <- colon$x[ ,1:100] # Use only the first 100 columns for this example
#' set.seed(1)
#' cv <- cv.dcsvm(colon$x, colon$y, lam2=1, nfolds=5)
#' plot(cv)
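#'
#' # An additional illustrative sketch (not part of the original example):
#' # plot against -log(lambda) by setting 'sign.lambda = -1'.
#' plot(cv, sign.lambda = -1)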
#'
#' @export
plot.cv.dcsvm <- function(x, sign.lambda=1, ...) {
  cvobj <- x
  xlab <- "log(Lambda)"
  if (sign.lambda < 0)
    xlab <- paste0("-", xlab)
  plot.args <- list(x=sign.lambda * log(cvobj$lambda),
                    y=cvobj$cvm, type="n", xlab=xlab, ylab=cvobj$name,
                    ylim=range(cvobj$cvupper, cvobj$cvlower))
  new.args <- list(...)
  if (length(new.args))
    plot.args[names(new.args)] <- new.args
  do.call("plot", plot.args)
  error.bars(sign.lambda * log(cvobj$lambda), cvobj$cvupper,
             cvobj$cvlower, width=0.01, col="darkgrey")
  points(sign.lambda * log(cvobj$lambda), cvobj$cvm,
         pch=20, col="red")
  ## show the number of nonzero coefficients along the top axis
  axis(side=3, at=sign.lambda * log(cvobj$lambda),
       labels=paste(cvobj$nzero), tick=FALSE, line=0)
  abline(v=sign.lambda * log(cvobj$lambda.min), lty=3)
  abline(v=sign.lambda * log(cvobj$lambda.1se), lty=3)
  invisible()
}
#' Compute Coefficients from a "cv.dcsvm" Object
#'
#' Computes coefficients at chosen values of \code{lambda} from the \code{\link{cv.dcsvm}} object.
#'
#' @name coef.cv.dcsvm
#' @aliases coef.cv.dcsvm
#' @title Compute Coefficients from a "cv.dcsvm" Object
#'
#' @description
#' Computes the coefficients at specified \code{lambda} values for a \code{cv.dcsvm} object.
#'
#' @usage
#' \method{coef}{cv.dcsvm}(object, s = c("lambda.1se", "lambda.min"), ...)
#'
#' @param object A fitted \code{\link{cv.dcsvm}} object, obtained by conducting cross-validation on the sparse density-convoluted SVM model.
#' @param s Value(s) of the L1 tuning parameter \code{lambda} for computing coefficients. Default is \code{"lambda.1se"}, the largest \code{lambda} value achieving a cross-validation error within one standard error of the minimum. Alternatively, \code{"lambda.min"} corresponds to the \code{lambda} incurring the least cross-validation error. \code{s} can also be numeric, specifying the value(s) to use.
#' @param ... Other arguments passed to \code{coef.dcsvm}.
#'
#' @details
#' This function computes the coefficients for \code{lambda} values suggested by cross-validation.
#'
#' @return
#' The coefficients of the fitted \code{\link{dcsvm}} model at the requested value(s) of \code{lambda}, as returned by \code{coef.dcsvm}.
#'
#' @seealso
#' \code{\link{cv.dcsvm}} and \code{\link{predict.cv.dcsvm}} methods.
#'
#' @examples
#' data(colon)
#' colon$x <- colon$x[ ,1:100] # Use only the first 100 columns for this example
#' set.seed(1)
#' cv <- cv.dcsvm(colon$x, colon$y, lam2=1, nfolds=5)
#' c1 <- coef(cv, s="lambda.1se")
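#'
#' # An additional illustrative sketch (not part of the original example):
#' # coefficients at 'lambda.min' and at a numeric value taken from the
#' # fitted lambda sequence.
#' c2 <- coef(cv, s = "lambda.min")
#' c3 <- coef(cv, s = cv$lambda[10])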
#'
#' @export
coef.cv.dcsvm <- function(object,
                          s=c("lambda.1se", "lambda.min"), ...) {
  if (is.numeric(s))
    lambda <- s else if (is.character(s)) {
    s <- match.arg(s)
    lambda <- object[[s]]
  } else stop("Invalid form for s.")
  coef(object$dcsvm.fit, s=lambda, ...)
}