Predict Method for Bayesian Kernel Projection Classifier

Description

This function predicts values for new input data from a model trained by bkpc. BKPC uses a Gibbs sampler to sample from the posterior distributions of the parameters, so sample probability distributions of the predictions can be obtained for new data points.

Usage

## S3 method for class 'bkpc'
predict(object, newdata  = NULL, n.burnin = 0, ...)

Arguments

object

a bkpc object.

newdata

a matrix containing the new input data.

n.burnin

number of burn-in iterations to discard.

...

Currently not used.
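
As a rough illustration of what discarding burn-in means for n.burnin, the sketch below drops the first draws from a simulated matrix of stored samples. Everything in it is hypothetical: the matrix `draws`, its layout (one column per stored draw), and the counts are assumptions made for illustration, and it does not show how predict.bkpc handles burn-in internally.

```r
# Illustrative only: a simulated chain, not output from bkpc.
set.seed(1)
n.draws  <- 100   # number of stored draws (hypothetical)
n.burnin <- 20    # number of initial draws to discard

# Assumed layout: rows are sampled quantities, columns are stored draws.
draws <- matrix(runif(3 * n.draws), nrow = 3, ncol = n.draws)

# Drop the first n.burnin columns; keep everything when n.burnin is 0.
kept <- if (n.burnin > 0) draws[, -seq_len(n.burnin), drop = FALSE] else draws
dim(kept)
```

The explicit check for n.burnin > 0 matters because negative indexing with an empty index vector would select no columns at all instead of keeping all of them.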

Details

If newdata is omitted the predictions are based on the data used for the fit.

Value

A list with the following components:

class

estimated class for each observation in the input data.

map

maximum a posteriori probability estimate for belonging to each class for all observations in the input data.

p

a matrix of samples of estimated probabilities for belonging to each class for each observation in the input data.
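
The indexing used in the Examples section suggests that, for a three-class problem with n observations, rows 1 to n of p hold the sampled probabilities for class 2 and rows n + 1 to 2n hold those for class 3, with one column per retained sample. The sketch below builds a simulated matrix with that assumed layout and recovers the class 1 samples for one observation. The data, the names `prob`, `n`, `K` and `S`, and the layout itself are assumptions for illustration, not a statement of how bkpc stores its output.

```r
# Illustrative only: simulated probabilities, not output from predict.bkpc.
set.seed(2)
n <- 5     # observations (hypothetical)
K <- 3     # classes
S <- 200   # retained samples (hypothetical)

# Simulate an n x K x S array whose entries sum to 1 over the class dimension.
raw  <- array(rexp(n * K * S), dim = c(n, K, S))
tot  <- apply(raw, c(1, 3), sum)          # n x S matrix of totals
prob <- sweep(raw, c(1, 3), tot, "/")

# Assumed layout of p: class 2 rows stacked above class 3 rows.
p <- rbind(prob[, 2, ], prob[, 3, ])      # (K - 1) * n rows, S columns

# Recover all three sample distributions for observation i.
i  <- 4
p2 <- p[i, ]
p3 <- p[i + n, ]
p1 <- 1 - (p2 + p3)

# Per-class sample means and the class with the largest mean.
post_mean <- c(mean(p1), mean(p2), mean(p3))
which.max(post_mean)
```

Whether the map component is computed from these samples in the same way is not stated here, so the sample means above should be read only as one possible summary.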

Author(s)

K. Domijan

References

Domijan K. and Wilson S. P.: Bayesian kernel projections for classification of high dimensional data. Statistics and Computing, 2011, Volume 21, Issue 2, pp 203-216

See Also

bkpc

Examples

library(BKPC)

set.seed(-88106935)

data(iris)
testset <- sample(1:150,30)

train <- as.matrix(iris[-testset,-5])
test <- as.matrix(iris[testset,-5])

wtr <- iris[-testset, 5]
wte <- iris[testset, 5]

result <- bkpc(train, y = wtr, n.iter = 1000, thin = 10, n.kpc = 2,
               intercept = FALSE, rotate = TRUE)

# predict
out <- predict(result, test, n.burnin = 20) 

# classification rate for the test set

sum(out$class == as.numeric(wte))/dim(test)[1]

table(out$class, as.numeric(wte))

# consider just misclassified observations:

misclassified <- out$class != as.numeric(wte)

tab <- cbind(out$map[misclassified, ], out$class[misclassified],
             as.numeric(wte)[misclassified])
colnames(tab) <- c("P(k = 1)", "P(k = 2)", "P(k = 3)", "predicted class", "true class")
tab


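# consider, say, the 28th observation in the test set:
# sample probability distributions of belonging to each of the three classes.
# Row 28 of out$p is indexed for class 2 and row 28 + (number of test
# observations) for class 3; class 1 is obtained as the remainder.

ProbClass2samples <- out$p[28, ]
ProbClass3samples <- out$p[28 + dim(test)[1], ]
ProbClass1samples <- 1 - (ProbClass2samples + ProbClass3samples)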
hist(ProbClass1samples)
hist(ProbClass2samples)
hist(ProbClass3samples)
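
The example above compares out$class with as.numeric(wte), which suggests the predicted classes are integer codes in the order of the factor levels. Under that assumption, the codes can be mapped back to the original labels for a more readable table. The sketch below uses hypothetical predicted codes (the true codes with three entries changed on purpose) so that it runs without a fitted model; `truth`, `pred_code` and `pred` are illustrative names.

```r
# Illustrative only: pred_code stands in for out$class and is made up here.
data(iris)
set.seed(3)
truth <- iris$Species[sample(1:150, 30)]

# Start from the true codes and deliberately change the first three.
pred_code <- as.numeric(truth)
flip <- 1:3
pred_code[flip] <- (pred_code[flip] %% 3) + 1

# Map integer codes back to labels, keeping the same level order.
pred <- factor(levels(truth)[pred_code], levels = levels(truth))

# Classification rate and a labelled confusion table.
mean(pred == truth)
table(predicted = pred, true = truth)
```

Keeping the same levels on both factors ensures that the table always has one row and one column per class, even when a class is never predicted.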