| multiclass routines | R Documentation |
Tools for multiclass classification, parametric and nonparametric.
avalogtrn(trnxy,yname)
ovaknntrn(trnxy,yname,k,xval=FALSE)
avalogpred()
classadjust(econdprobs,wrongprob1,trueprob1)
boundaryplot(y01,x,regests,pairs=combn(ncol(x),2),pchvals=2+y01,cex=0.5,band=0.10)
pchvals |
Point symbols (pch values) in base-R graphics. |
trnxy |
Data matrix, Y last. |
xval |
If TRUE, use leaving-one-out method. |
y01 |
Y vector (1s and 0s). |
regests |
Estimated regression function values. |
x |
X data frame or matrix. |
pairs |
Two-row matrix, column i of which is a pair of predictor variables to graph. |
cex |
Symbol size for plotting. |
band |
If |
yname |
Name of the Y column. |
k |
Number of nearest neighbors. |
econdprobs |
Estimated conditional class probabilities, given the predictors. |
wrongprob1 |
Incorrect, data-provenanced, unconditional P(Y = 1). |
trueprob1 |
Correct unconditional P(Y = 1). |
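The three arguments econdprobs, wrongprob1 and trueprob1 suggest a standard Bayes-rule correction for class probabilities estimated from data whose class proportions differ from those of the population. The following is a minimal sketch of that correction, not the package's own code; the helper name adjust_probs is hypothetical, and it assumes all estimated probabilities lie strictly between 0 and 1.

```r
# Hypothetical helper (not classadjust() itself): rescale the estimated
# odds by the ratio of the true prior odds to the wrong prior odds.
adjust_probs <- function(econdprobs, wrongprob1, trueprob1) {
  wrongodds <- wrongprob1 / (1 - wrongprob1)   # prior odds implied by the data
  trueodds  <- trueprob1 / (1 - trueprob1)     # correct prior odds
  odds      <- econdprobs / (1 - econdprobs)   # estimated conditional odds
  newodds   <- odds * trueodds / wrongodds
  newodds / (1 + newodds)                      # back to probabilities
}

# Data sampled with equal class sizes, while the true P(Y = 1) is 0.1
p <- c(0.2, 0.5, 0.8)
adjusted <- adjust_probs(p, wrongprob1 = 0.5, trueprob1 = 0.1)
print(adjusted)
```

When wrongprob1 equals trueprob1 the odds ratio is 1, so the estimates are returned unchanged.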
These functions aid classification in the multiclass setting.
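As an illustration of the parametric multiclass approach, here is a minimal one-vs-all logit sketch in base R. It is not the package's implementation: the simulated data, the variable names and the use of glm() with max.col() are all assumptions made for the example.

```r
# Simulated 3-class data (assumed for illustration only)
set.seed(1)
n <- 300
y <- sample(0:2, n, replace = TRUE)
d <- data.frame(x1 = rnorm(n, mean = y), x2 = rnorm(n, mean = -y))
classes <- sort(unique(y))

# One logistic regression per class: this class vs. all the others
fits <- lapply(classes, function(cl) {
  glm(as.integer(y == cl) ~ x1 + x2, data = d, family = binomial)
})

# n x 3 matrix of estimated P(Y = class | X), one column per class
probs <- sapply(fits, function(f) predict(f, newdata = d, type = "response"))

# Predict the class whose model gives the largest estimated probability
preds <- classes[max.col(probs, ties.method = "first")]
acc <- mean(preds == y)   # in-sample proportion correctly classified
print(acc)
```

The per-class probabilities come from separate models, so each row of probs need not sum to 1; only the largest entry in each row is used.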
The function boundaryplot is a visualization tool for the
two-class setting. It draws the boundary between data points predicted
as Y = 1 and those predicted as Y = 0 in 2-dimensional feature space,
as determined by the argument regests. It is used to assess
goodness of fit visually, typically by running the function twice, say
once with regests from glm and once from kNN. If the two plots differ
substantially and the analyst still wishes to use glm(), adding
polynomial terms may help.
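The glm-versus-kNN comparison can be sketched in base R without the package. The code below is only an approximation of the idea and does not call boundaryplot: the simulated data, the hand-rolled kNN estimate, the choice k = 15 and the helper plot_pred are all assumptions of the example.

```r
# Simulated two-class data with a curved true boundary (assumed)
set.seed(2)
n <- 200
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y01 <- as.integer(x[, 1]^2 + x[, 2] + rnorm(n, sd = 0.5) > 1)

# Parametric estimate of P(Y = 1 | X) from a linear logit
reg_glm <- fitted(glm(y01 ~ x, family = binomial))

# Nonparametric estimate: mean of Y among the k nearest points
# (each point counts as its own nearest neighbor here)
k <- 15
dists <- as.matrix(dist(x))
reg_knn <- apply(dists, 1, function(d) mean(y01[order(d)[1:k]]))

# Hypothetical helper: color by predicted class, symbol by actual class
plot_pred <- function(regests, main) {
  plot(x[, 1], x[, 2], pch = 2 + y01, cex = 0.5,
       col = ifelse(regests > 0.5, "red", "blue"),
       xlab = "x1", ylab = "x2", main = main)
}
op <- par(mfrow = c(1, 2))
plot_pred(reg_glm, "glm")
plot_pred(reg_knn, "kNN")
par(op)

# Proportion of points on which the two fits predict different classes
disagree <- mean((reg_glm > 0.5) != (reg_knn > 0.5))
print(disagree)
```

A large value of disagree, or visibly different red and blue regions in the two panels, hints that the linear logit is missing curvature that kNN picks up.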
The functions not listed above are largely deprecated, e.g. in favor of
qeLogit and the other qe-series functions.
Norm Matloff
## Not run:

data(oliveoils)
oo <- oliveoils[,-1]

# toy example
set.seed(9999)
x <- runif(25)
y <- sample(0:2,25,replace=TRUE)
xd <- preprocessx(x,2,xval=FALSE)
kout <- ovaknntrn(y,xd,m=3,k=2)
kout$regest  # row 2: 0.0,0.5,0.5
predict(kout,predpts=matrix(c(0.81,0.55,0.15),ncol=1))  # 0,2,0or2
yd <- factorToDummies(as.factor(y),'y',FALSE)
kNN(x,yd,c(0.81,0.55,0.15),2)  # predicts 0, 1or2, 2

data(peDumms)  # prog/engr data
ped <- peDumms[,-33]
ped <- as.matrix(ped)
x <- ped[,-(23:28)]
y <- ped[,23:28]
knnout <- kNN(x,y,x,25,leave1out=TRUE)
truey <- apply(y,1,which.max) - 1
mean(knnout$ypreds == truey)  # about 0.37
xd <- preprocessx(x,25,xval=TRUE)
kout <- knnest(y,xd,25)
preds <- predict(kout,predpts=x)
yhats <- apply(preds,1,which.max) - 1
mean(yhats == truey)  # about 0.37

data(peFactors)
# discard the lower educ-level cases, which are rare
edu <- peFactors$educ
numedu <- as.numeric(edu)
idxs <- numedu >= 12
pef <- peFactors[idxs,]
numedu <- numedu[idxs]
pef$educ <- as.factor(numedu)
pef1 <- pef[,c(1,3,5,7:9)]

# ovalog
ovaout <- ovalogtrn(pef1,"occ")
preds <- predict(ovaout,predpts=pef1[,-3])
mean(preds == factorTo012etc(pef1$occ))  # about 0.39

# avalog
avaout <- avalogtrn(pef1,"occ")
preds <- predict(avaout,predpts=pef1[,-3])
mean(preds == factorTo012etc(pef1$occ))  # about 0.39

# knn
knnout <- ovaknntrn(pef1,"occ",25)
preds <- predict(knnout,predpts=pef1[,-3])
mean(preds == factorTo012etc(pef1$occ))  # about 0.43

data(oliveoils)
oo <- oliveoils
oo <- oo[,-1]
knnout <- ovaknntrn(oo,'Region',10)
# predict a new case that is like oo[1,] but with palmitic = 950
newx <- oo[1,2:9,drop=FALSE]
newx[,1] <- 950
predict(knnout,predpts=newx)  # predicts class 2, South

## End(Not run)