bkpc: Bayesian Kernel Projection Classifier


Description

Function bkpc is used to train a Bayesian kernel projection classifier. This is a nonlinear multicategory classifier that performs classification of the projections of the data onto the principal axes of the feature space. A Gibbs sampler is implemented to find the posterior distributions of the parameters, so probability distributions of the predictions can be obtained for new observations.

Usage

## Default S3 method:
bkpc(x, y, theta = NULL, n.kpc = NULL, thin = 100, n.iter = 1e+05, std = 10, 
g1 = 0.001, g2 = 0.001, g3 = 1, g4 = 1, initSigmasq = NULL, initBeta = NULL,
initTau = NULL, intercept = TRUE, rotate = TRUE, ...)


## S3 method for class 'kern'
bkpc(x, y, n.kpc = NULL, thin = 100, n.iter = 1e+05, std = 10, 
g1 = 0.001, g2 = 0.001, g3 = 1, g4 = 1, initSigmasq = NULL, initBeta = NULL, 
initTau = NULL, intercept = TRUE, rotate = TRUE, ...)


## S3 method for class 'kernelMatrix'
bkpc(x, y, n.kpc = NULL, thin = 100, n.iter = 1e+05, std = 10, 
g1 = 0.001, g2 = 0.001, g3 = 1, g4 = 1, initSigmasq = NULL, initBeta = NULL, 
initTau = NULL, intercept = TRUE, rotate = TRUE, ...)

Arguments

x

either: a data matrix, a kernel matrix of class "kernelMatrix" or a kernel matrix of class "kern".

y

a response vector with one label for each row of x. Should be a factor.

theta

the inverse kernel bandwidth parameter.

n.kpc

number of kernel principal components to use.

n.iter

number of iterations for the MCMC algorithm.

thin

thinning interval.

std

standard deviation parameter for the random walk proposal.

g1

γ_1 hyper-parameter of the prior inverse gamma distribution for the σ^2 parameter in the BKPC model.

g2

γ_2 hyper-parameter of the prior inverse gamma distribution for the σ^2 parameter in the BKPC model.

g3

γ_3 hyper-parameter of the prior gamma distribution for the τ parameter in the BKPC model.

g4

γ_4 hyper-parameter of the prior gamma distribution for the τ parameter in the BKPC model.

initSigmasq

optional specification of initial value for the σ^2 parameter in the BKPC model.

initBeta

optional specification of initial values for the β parameters in the BKPC model.

initTau

optional specification of initial values for the τ parameters in the BKPC model.

intercept

if intercept = TRUE (the default), an intercept is included in the model.

rotate

if rotate = TRUE (the default), the BKPC model is run; otherwise the BKMC model is run.

...

Currently not used.

Details

Initial values for a BKPC model can be supplied; otherwise they are generated using the runif function.
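
For example, initial values can be supplied explicitly (a sketch, assuming K is a precomputed kernel matrix for the training data and y the corresponding factor of class labels; the initial values below are arbitrary placeholders, not recommendations):

n.kpc <- 4
n.class <- length(levels(y)) - 1   # number of classes - 1
fit <- bkpc(K, y = y, n.kpc = n.kpc, n.iter = 1000, thin = 10,
            initSigmasq = 0.01,
            initBeta = matrix(0, n.kpc * n.class, 1),
            initTau = matrix(1, n.kpc * n.class, 1),
            intercept = FALSE)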

The data can be passed to the bkpc function as a matrix, in which case the Gaussian kernel computed by the gaussKern function is used for training and prediction. The bandwidth parameter theta can be supplied to the gaussKern function; otherwise a default value is used.
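
For instance (a sketch on the iris data; the value of theta is illustrative, not a tuned choice):

data(iris)
x <- as.matrix(iris[, -5])
y <- iris[, 5]

# default bandwidth chosen by gaussKern:
fit1 <- bkpc(x, y = y, n.kpc = 2, n.iter = 1000, thin = 10)

# explicit inverse kernel bandwidth (illustrative value):
fit2 <- bkpc(x, y = y, theta = 0.1, n.kpc = 2, n.iter = 1000, thin = 10)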

In addition, bkpc also supports input in the form of a kernel matrix of class "kern" or "kernelMatrix". The latter allows for a range of kernel functions as well as user-specified ones.
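
For example, a kernel matrix computed with kernlab can be passed directly (a sketch, reusing x and y from the sketch above; rbfdot and its sigma value are one arbitrary choice among the kernels kernlab offers):

library(kernlab)
kfunc <- rbfdot(sigma = 0.1)
K <- kernelMatrix(kfunc, x)
fit <- bkpc(K, y = y, n.kpc = 2, n.iter = 1000, thin = 10)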

If rotate = TRUE (the default), the BKPC is trained. This algorithm performs classification of the projections of the data onto the principal axes of the feature space. Otherwise, the Bayesian kernel multicategory classifier (BKMC) is trained, where the data are mapped to the feature space via the kernel matrix but not projected (rotated) onto the principal axes. The hierarchical prior structure is the same for the two models, but the BKMC model is not sparse.
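
The two models are selected by the rotate argument alone (a sketch, reusing K and y from above; note that n.kpc is only relevant when the data are projected):

fitBKPC <- bkpc(K, y = y, n.kpc = 2, n.iter = 1000, thin = 10, rotate = TRUE)
fitBKMC <- bkpc(K, y = y, n.iter = 1000, thin = 10, rotate = FALSE)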

Value

An object of class "bkpc" including:

beta

realizations of the β parameters from the joint posterior distribution in the BKPC model.

tau

realizations of the τ parameters from the joint posterior distribution in the BKPC model.

z

realizations of the latent variables z from the joint posterior distribution in the BKPC model.

sigmasq

realizations of the σ^2 parameter from the joint posterior distribution in the BKPC model.

n.class

number of independent classes of the response variable, i.e. the number of classes minus 1.

n.kpc

number of kernel principal components used.

n.iter

number of iterations of the MCMC algorithm.

thin

thinning interval.

intercept

if TRUE, an intercept was included in the model.

rotate

if TRUE, the sparse BKPC model was fitted; otherwise the BKMC model was fitted.

kPCA

if rotate = TRUE, an object of class "kPCA"; otherwise NULL.

x

the supplied data matrix or kernel matrix.

theta

if a data matrix was supplied, as opposed to a kernel matrix, this is the inverse kernel bandwidth parameter used in obtaining the Gaussian kernel; otherwise NULL.
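
The stored draws can be examined directly, e.g. with a trace plot of the σ^2 realizations (a sketch, assuming fit is a fitted "bkpc" object):

plot(as.numeric(fit$sigmasq), type = "l",
     xlab = "stored iteration", ylab = expression(sigma^2))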

Note

If a data matrix is supplied, the data are not scaled internally. If rotate = TRUE, the mapping is centered internally by the kPCA function.
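
Any scaling must therefore be applied beforehand, e.g. (a sketch on the iris data):

x.scaled <- scale(as.matrix(iris[, -5]))   # center and scale columns first
fit <- bkpc(x.scaled, y = iris[, 5], n.kpc = 2, n.iter = 1000, thin = 10)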

Author(s)

K. Domijan

References

Domijan, K. and Wilson, S. P.: Bayesian kernel projections for classification of high dimensional data. Statistics and Computing, 2011, Volume 21, Issue 2, pp. 203-216.

See Also

kPCA, gaussKern, predict.bkpc, plot.bkpc, summary.bkpc, kernelMatrix (in package kernlab)

Examples

set.seed(-88106935)

data(microarray)

# consider only four tumour classes (NOTE: "NORM" is not a class of tumour)
y <- microarray[, 2309]
train <- as.matrix(microarray[y != "NORM", -2309])
wtr <- factor(microarray[y != "NORM", 2309], levels = c("BL", "EWS", "NB", "RMS"))

n.kpc <- 6
n.class <- length(levels(wtr)) - 1

K <- gaussKern(train)$K

# supply starting values for the parameters
# use Gaussian kernel as input

result <- bkpc(K, y = wtr, n.iter = 1000, thin = 10, n.kpc = n.kpc,
initSigmasq = 0.001, initBeta = matrix(10, n.kpc * n.class, 1),
initTau = matrix(10, n.kpc * n.class, 1), intercept = FALSE, rotate = TRUE)

# predict

out <- predict(result, n.burnin = 10) 

table(out$class, as.numeric(wtr))

# plot the data projection on the kernel principal components

pairs(result$kPCA$KPCs[, 1:n.kpc], col = as.numeric(wtr),
main = paste("symbol = predicted class", "\n", "color = true class"),
pch = out$class, upper.panel = NULL)
par(xpd = TRUE)
legend("topright", levels(wtr), pch = unique(out$class),
text.col = as.numeric(unique(wtr)), bty = "n")




# Another example: Iris data

data(iris)
testset <- sample(1:150, 50)

train <- as.matrix(iris[-testset, -5])
test <- as.matrix(iris[testset, -5])

wtr <- iris[-testset, 5]
wte <- iris[testset, 5]

# use default starting values for parameters in the model.

result <- bkpc(train, y = wtr,  n.iter = 1000,  thin = 10, n.kpc = 2, 
intercept = FALSE, rotate = TRUE)

# predict
out <- predict(result, test, n.burnin = 10) 

# classification rate
sum(out$class == as.numeric(wte))/dim(test)[1]

table(out$class, as.numeric(wte))

## Not run: 
# Another example: synthetic data from MASS library

library(MASS)

train <- as.matrix(synth.tr[, -3])
test <- as.matrix(synth.te[, -3])

wtr <- as.factor(synth.tr[, 3])
wte <- as.factor(synth.te[, 3])


#  make training set kernel using kernelMatrix from kernlab library

library(kernlab)

kfunc <- laplacedot(sigma = 1)
Ktrain <- kernelMatrix(kfunc, train)

#  make testing set kernel using kernelMatrix {kernlab}

Ktest <- kernelMatrix(kfunc, test, train)

result <- bkpc(Ktrain, y = wtr, n.iter = 1000,  thin = 10,  n.kpc = 3, 
intercept = FALSE, rotate = TRUE)

# predict

out <- predict(result, Ktest, n.burnin = 10) 

# classification rate

sum(out$class == as.numeric(wte))/dim(test)[1]
table(out$class, as.numeric(wte))


# embed data from the testing set on the new space:

KPCtest <- predict(result$kPCA, Ktest)

# new data is linearly separable in the new feature space where classification takes place
library(rgl)
plot3d(KPCtest[, 1:3], col = as.numeric(wte))


# another model:  do not project the data to the principal axes of the feature space. 
# NOTE: Slow
# use Gaussian kernel with the default bandwidth parameter

Ktrain <- gaussKern(train)$K

Ktest <- gaussKern(train, test, theta = gaussKern(train)$theta)$K

resultBKMC <- bkpc(Ktrain, y = wtr, n.iter = 1000,  thin = 10,  
intercept = FALSE, rotate = FALSE)

# predict
outBKMC <- predict(resultBKMC, Ktest, n.burnin = 10)

# to compare with previous model
table(outBKMC$class, as.numeric(wte))


# another example: wine data from gclus library

library(gclus)
data(wine)

testset <- sample(1 : 178, 90)
train <- as.matrix(wine[-testset, -1])
test <- as.matrix(wine[testset, -1])

wtr <- as.factor(wine[-testset, 1])
wte <- as.factor(wine[testset, 1])

#  make training set kernel using kernelMatrix from kernlab library

kfunc <- anovadot(sigma = 1, degree = 1)
Ktrain <- kernelMatrix(kfunc, train)

#  make testing set kernel using kernelMatrix {kernlab}
Ktest <- kernelMatrix(kfunc, test, train)

result <- bkpc(Ktrain, y = wtr, n.iter = 1000,  thin = 10,  n.kpc = 3, 
intercept = FALSE, rotate = TRUE)

out <- predict(result, Ktest, n.burnin = 10) 

# classification rate in the test set
sum(out$class == as.numeric(wte))/dim(test)[1]


# embed data from the testing set on the new space:
KPCtest <- predict(result$kPCA, Ktest)

# new data is linearly separable in the new feature space where classification takes place


pairs(KPCtest[, 1:3], col = as.numeric(wte),
main = paste("symbol = predicted class", "\n", "color = true class"),
pch = out$class, upper.panel = NULL)

par(xpd = TRUE)

legend("topright", levels(wte), pch = unique(out$class),
text.col = as.numeric(unique(wte)), bty = "n")



## End(Not run)
