dcsvm: Density-Convoluted Support Vector Machine

dcsvm    R Documentation

Description

Fits the density-convoluted support vector machine (DCSVM) through kernel density convolutions.

Usage

dcsvm(
  x,
  y,
  nlambda = 100,
  lambda.factor = ifelse(nobs < nvars, 0.01, 1e-04),
  lambda = NULL,
  lam2 = 0,
  kern = c("gaussian", "uniform", "epanechnikov"),
  hval = 1,
  pf = rep(1, nvars),
  pf2 = rep(1, nvars),
  exclude,
  dfmax = nvars + 1,
  pmax = min(dfmax * 1.2, nvars),
  standardize = TRUE,
  eps = 1e-08,
  maxit = 1e+06,
  istrong = TRUE
)

Arguments

x

A numeric matrix with N rows and p columns representing predictors. Each row corresponds to an observation, and each column corresponds to a variable.

y

A numeric vector of length N representing binary responses. Elements must be either -1 or 1.
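Labels are often stored as 0/1 rather than -1/1. A minimal base-R sketch of the recoding, assuming a hypothetical 0/1 vector y01 that is not part of the package:

```r
# Hypothetical 0/1 labels; dcsvm() expects -1/1.
y01 <- c(0, 1, 1, 0, 1)
y <- ifelse(y01 == 1, 1, -1)
# Sanity check before fitting: only -1 and 1 should remain.
stopifnot(all(y %in% c(-1, 1)))
```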

nlambda

Number of lambda values in the sequence. Default is 100.

lambda.factor

Ratio of the smallest to the largest lambda in the sequence: lambda.factor = min(lambda) / max(lambda). The default is 0.0001 if N >= p and 0.01 if N < p. Has no effect if a lambda sequence is supplied.
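For illustration, a sequence that honours this ratio can be built on a log scale, as is common for path-following solvers. This is only a sketch: the spacing the package actually uses is not stated here, and lambda_max is a hypothetical value standing in for the largest lambda.

```r
nlambda <- 100
lambda.factor <- 1e-04
lambda_max <- 2.5  # hypothetical; in practice derived from the data
# Log-spaced sequence from lambda_max down to lambda_max * lambda.factor.
lambda_seq <- exp(seq(log(lambda_max), log(lambda_max * lambda.factor),
                      length.out = nlambda))
# The ratio of the smallest to the largest value equals lambda.factor.
min(lambda_seq) / max(lambda_seq)
```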

lambda

An optional user-specified sequence of lambda values. If lambda = NULL (default), the sequence is computed based on nlambda and lambda.factor. The program automatically sorts user-defined lambda sequences in decreasing order.

lam2

The tuning parameter \lambda_2, which controls the strength of the L2 regularization. Default is 0 (lasso).

kern

Type of kernel used for smoothing. Options are "gaussian", "uniform", and "epanechnikov". Default is "gaussian", the first option listed in the usage.

hval

The bandwidth parameter for kernel smoothing. Default is 1.
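For reference, the three kernels named under kern have the standard textbook densities sketched below, rescaled by a bandwidth h. This is an illustration only: the function names are hypothetical, and the package may parameterize or normalize its kernels differently.

```r
# Standard kernel densities on the real line (textbook definitions).
k_gaussian     <- function(u) dnorm(u)
k_uniform      <- function(u) 0.5 * (abs(u) <= 1)
k_epanechnikov <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Rescale a kernel K by bandwidth h: K_h(u) = K(u / h) / h.
scaled <- function(K, h) function(u) K(u / h) / h

# Each rescaled kernel still integrates to 1 (here with h = 0.5).
h <- 0.5
int_gauss <- integrate(scaled(k_gaussian, h), -Inf, Inf)$value
int_unif  <- integrate(scaled(k_uniform, h), -h, h)$value
int_epan  <- integrate(scaled(k_epanechnikov, h), -h, h)$value
c(int_gauss, int_unif, int_epan)
```

A smaller h concentrates the kernel mass near zero, while a larger h spreads it out.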

pf

A numeric vector of length p representing the L1 penalty weights for each coefficient. A common choice is (|\beta_j| + 1/n)^{-1} for the j-th coefficient, where n is the sample size and \beta is obtained from an initial L1 DCSVM or elastic-net DCSVM fit. Default is 1 for all predictors.
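A small sketch of such adaptive weights, using the absolute value of each initial coefficient so that the weights stay positive. The coefficient vector and sample size below are hypothetical; in practice the coefficients would come from an initial fit.

```r
# Hypothetical coefficients from an initial L1 or elastic-net fit.
beta_init <- c(0.8, 0, -0.05, 0, 1.2)
n <- 62  # hypothetical sample size
# Adaptive L1 weights: large initial coefficients get small penalties,
# and coefficients that were exactly zero get the largest weight, n.
pf_adapt <- 1 / (abs(beta_init) + 1 / n)
round(pf_adapt, 2)
```

The resulting vector could then be passed as pf in a second fit.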

pf2

A numeric vector of length p representing the L2 penalty weights for each coefficient. A value of 0 indicates no L2 shrinkage. Default is 1 for all predictors.

exclude

Indices of predictors to exclude from the model. Equivalent to assigning an infinite penalty factor. Default is none.

dfmax

Maximum number of nonzero coefficients allowed in the model. Default is p + 1. Useful for large p when a partial path is acceptable.

pmax

Maximum number of variables allowed to ever be nonzero during the computation. Default is min(dfmax * 1.2, p).

standardize

Logical indicating whether predictors should be standardized to unit variance. Default is TRUE. Note that predictors are always centered.
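What centering and unit-variance scaling do to a predictor matrix can be sketched with base R's scale(). This assumes the sample variance with an n - 1 denominator; the package may use a different convention internally. The simulated matrix is hypothetical.

```r
set.seed(1)
x <- matrix(rnorm(20 * 3, mean = 5, sd = 2), nrow = 20)
# Center each column and scale it to unit (sample) variance.
x_std <- scale(x, center = TRUE, scale = TRUE)
col_means <- colMeans(x_std)      # approximately 0
col_sds   <- apply(x_std, 2, sd)  # 1 for every column
```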

eps

Convergence threshold. The algorithm stops when 4\max_j(\beta_j^{new} - \beta_j^{old})^2 is less than eps. Default is 1e-8.
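The stopping rule can be written out directly. The two coefficient vectors below are hypothetical and stand for successive iterates; the sketch only evaluates the criterion and is not the package's solver.

```r
eps <- 1e-08
beta_old <- c(0.50000, -0.20000, 0)
beta_new <- c(0.50001, -0.20000, 0)
# Criterion from the text: 4 * max_j (beta_new_j - beta_old_j)^2 < eps.
crit <- 4 * max((beta_new - beta_old)^2)
converged <- crit < eps
```

Here the largest change is 1e-05, so the criterion is about 4e-10 and falls below eps.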

maxit

Maximum number of iterations allowed. Default is 1e6. Consider increasing maxit if the algorithm does not converge.

istrong

Logical indicating whether to use the strong rule for faster computation. Default is TRUE.

Value

An object of class dcsvm containing the following components:

b0

Intercept values for each lambda.

beta

Sparse matrix of coefficients for each lambda. Use as.matrix() to convert.
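A sketch of the conversion on a toy sparse matrix. It assumes the coefficients are stored in a sparse class from the Matrix package, with one column per lambda; neither detail is stated above, and the toy values are hypothetical.

```r
library(Matrix)
# Hypothetical 4 x 3 sparse coefficient matrix (variables x lambdas).
beta <- sparseMatrix(i = c(1, 3, 1), j = c(2, 3, 3),
                     x = c(0.4, -0.2, 0.7), dims = c(4, 3))
# Convert to an ordinary dense matrix.
dense <- as.matrix(beta)
# Count the nonzero coefficients in each column.
nonzero_per_col <- colSums(dense != 0)
```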

df

Number of nonzero coefficients for each lambda.

dim

Dimensions of the coefficient matrix.

lambda

Sequence of lambda values used.

npasses

Total number of iterations across all lambda values.

jerr

Flag for warnings and errors; 0 indicates no error.

call

The matched call.

See Also

print.dcsvm, predict.dcsvm, coef.dcsvm, plot.dcsvm, and cv.dcsvm.

Examples

# Load the package and the data
library(dcsvm)
data(colon)
# Fit the elastic-net penalized DCSVM with lambda2 set to 1
fit <- dcsvm(colon$x, colon$y, lam2 = 1)
print(fit)
# Coefficients at a single lambda value
c1 <- coef(fit, s = 0.005)
# Predictions for the first 10 observations at two lambda values
predict(fit, newx = colon$x[1:10, ], s = c(0.01, 0.005))


dcsvm documentation built on April 3, 2025, 10:27 p.m.