Description
Fit a sparse linear kernel hard margin SVM comparison model to linearly separable data. We first normalize the data using pairs2svmData, resulting in a scaled n x p feature difference matrix X, with a new vector of comparisons y in c(-1,1). We then define the linear kernel matrix K = XX' and solve the dual quadratic program (QP) of the SVM:

$$\min_{\alpha \in R^n} \frac{1}{2}\alpha' K \alpha - y'\alpha \quad \text{subject to } y_i \alpha_i \geq 0 \text{ for all } i.$$

The learned function in the scaled binary SVM space is $f(x) = b + \sum_{i \in sv} \alpha_i k(d_i, x)$, where sv is the set of support vectors and the bias b is the average of $y_i - \sum_{j \in sv} \alpha_j k(d_j, d_i)$ over all support vectors i. The learned ranking function in the original space is $r(x) = \sum_{i \in sv} -(\alpha_i / b) k(d_i, S x)$, where S is the diagonal scaling matrix of the input features. Since we use the linear kernel k, we can also write this function as $r(x) = w'x$ with the weight vector $w = -(S/b) \sum_{i \in sv} \alpha_i d_i$.
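To make the dual QP concrete, here is a minimal sketch of how it maps onto a generic QP solver, using quadprog and simulated separable data. The solver choice, the toy data, and the larger diagonal ridge are assumptions for illustration; this is not the actual hardCompareQP implementation.

## Sketch: solve min a'Ka/2 - y'a subject to y_i a_i >= 0 via quadprog.
library(quadprog)
set.seed(1)
X <- matrix(rnorm(40), 20, 2)                    # scaled n x p feature differences.
y <- as.numeric(ifelse(X %*% c(2, -1) > 0, 1, -1)) # separable comparisons in c(-1,1).
K <- X %*% t(X)                                  # linear kernel matrix K = XX'.
ridge <- 1e-6   # larger than the 1e-10 default, for solve.QP stability.
## solve.QP minimizes a'Da/2 - d'a subject to A'a >= b0, so we set
## D = K + ridge*I, d = y, A = diag(y), b0 = 0.
qp <- solve.QP(Dmat=K + diag(ridge, nrow(K)), dvec=y,
               Amat=diag(y), bvec=rep(0, nrow(K)))
alpha <- qp$solution
sv <- abs(alpha) > 0.001                 # support vectors (cf. sv.threshold).
b <- mean(y[sv] - (K %*% alpha)[sv])     # bias averaged over support vectors.
w <- -colSums(alpha[sv] * X[sv, ]) / b   # ranking weights (S = identity here).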
Usage

hardCompareQP(Pairs, add.to.diag = 1e-10, sv.threshold = 0.001)
Arguments

Pairs
    Comparison data to fit; see pairs2svmData.

add.to.diag
    This value is added to the diagonal of the kernel matrix, to ensure that it is positive definite.

sv.threshold
    Optimal coefficients $\alpha_i$ with absolute value greater than this value are considered support vectors.
Value

Comparison model fit. You can do fit$rank(X) to get m numeric ranks for the rows of the m x p numeric matrix X; for two feature vectors xi and xip, we predict no significant difference if their absolute rank difference is less than 1. You can do fit$predict(Xi, Xip) to get m predicted comparisons in c(-1,0,1), for m x p numeric matrices Xi and Xip. Also, fit$sigma gives the scales of the input features, fit$sv gives the support vectors (in the scaled space), fit$weight is the optimal weight vector (in the original space), and if fit$margin is positive then the data are separable.
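As a consistency check of the above, the ranks from fit$rank should agree with the linear function $r(x) = w'x$ evaluated with fit$weight. A hedged sketch, assuming the separable data set used in the Examples below:

library(rankSVMcompare)
data(separable, envir=environment())
fit <- hardCompareQP(separable)
## ranks computed two ways should match if fit$weight is the ranking
## weight vector in the original feature space, as described above.
stopifnot(all.equal(as.numeric(fit$rank(separable$Xi)),
                    as.numeric(separable$Xi %*% fit$weight)))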
Author(s)

Toby Dylan Hocking
Examples

library(rankSVMcompare)
data(separable, envir=environment())
sol <- hardCompareQP(separable)
## check to make sure we have perfect prediction.
y.hat <- with(separable, sol$predict(Xi, Xip))
stopifnot(separable$yi == y.hat)
## This should also be the same:
fxdiff <- with(separable, sol$rank(Xip)-sol$rank(Xi))
y.hat2 <- ifelse(fxdiff < -1, -1L,
          ifelse(fxdiff > 1, 1L, 0L))
stopifnot(y.hat == y.hat2)
## difference vectors and support vectors to plot.
point.df <- with(separable, data.frame(Xip-Xi, yi))
sv.df <- with(sol$sv, data.frame(t(t(X)*sol$sigma)))
## calc svm decision boundary and margin.
mu <- sol$margin
arange <- range(point.df$angle)
seg <- function(v, line){
  ## endpoints of the line w'x = v over the plotted angle range.
  d <- (v-sol$weight[2]*arange)/sol$weight[1]
  data.frame(t(c(distance=d, angle=arange)), line)
}
seg.df <- rbind(seg(1-mu, "margin"),
                seg(1+mu, "margin"),
                seg(-1-mu, "margin"),
                seg(-1+mu, "margin"),
                seg(1, "decision"),
                seg(-1, "decision"))
library(ggplot2)
svplot <- ggplot()+
  geom_point(aes(distance, angle, colour=factor(yi)), data=point.df,
             size=3)+
  geom_point(aes(distance, angle), data=sv.df, size=1.5)+
  geom_segment(aes(distance1, angle1, xend=distance2, yend=angle2,
                   linetype=line), data=seg.df)+
  scale_linetype_manual(values=c(decision="solid", margin="dotted"))+
  ggtitle(paste("Hard margin linear kernel comparison model",
                "support vectors in black", sep="\n"))
print(svplot)
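The Value section states that a positive fit$margin indicates separable data; since this example uses the separable data set, we can sketch that check here as well:

## margin should be positive for this separable data set.
stopifnot(sol$margin > 0)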