multinomial_naive_bayes: Multinomial Naive Bayes Classifier


Description

multinomial_naive_bayes is used to fit the Multinomial Naive Bayes model.

Usage

multinomial_naive_bayes(x, y, prior = NULL, laplace = 0.5, ...)

Arguments

x

numeric matrix with integer predictors (matrix or dgCMatrix from Matrix package).

y

class vector (character/factor/logical).

prior

vector with prior probabilities of the classes. If unspecified, the class proportions for the training set are used. If present, the probabilities should be specified in the order of the factor levels.

laplace

value used for Laplace smoothing (additive smoothing). Defaults to 0.5.

...

not used.
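
The ordering requirement for prior can be checked with base R alone. The sketch below uses hypothetical toy labels (the names "ham" and "spam" are illustrative assumptions, not part of the package) to show the level order and the class proportions that serve as the default when prior is left unspecified.

```r
# Hypothetical toy labels, for illustration only
y <- factor(c("spam", "ham", "spam", "ham", "ham"))

# Factor levels are sorted alphabetically by default: "ham" first, then "spam"
levels(y)

# Class proportions of the training labels (the default when prior = NULL)
default_prior <- as.numeric(prop.table(table(y)))
names(default_prior) <- levels(y)
default_prior

# A custom prior has to follow the same order as levels(y)
custom_prior <- c(ham = 0.5, spam = 0.5)
stopifnot(identical(names(custom_prior), levels(y)))
```

Naming the elements of the prior vector, as done here, is only a readability aid in this sketch; what matters according to the argument description is the position of each probability relative to the factor levels.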

Details

This is a specialized version of the Naive Bayes classifier, where the features represent frequencies generated by a multinomial distribution.
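
To make the role of additive smoothing concrete, here is a minimal base R sketch of the textbook estimator for the class-conditional parameters. The toy matrix and labels are hypothetical, and the formula is the standard one for this model; it is assumed, not confirmed by this page, to correspond to what the function computes internally.

```r
# Hypothetical toy count matrix: 4 observations (rows), 3 features (columns)
x <- matrix(c(2, 0, 1,
              1, 1, 0,
              0, 3, 1,
              0, 2, 2),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("V1", "V2", "V3")))
y <- factor(c("A", "A", "B", "B"))
laplace <- 1

# Total count of each feature within each class (classes in rows)
counts <- rowsum(x, y)

# Additive smoothing: add `laplace` to every count, then normalise per class
smoothed <- (counts + laplace) / (rowSums(counts) + laplace * ncol(x))

# Features in rows, classes in columns; each column sums to 1
params <- t(smoothed)
params
colSums(params)
```

With laplace = 0, a feature that never occurs in a class would get probability zero and rule that class out for any observation containing the feature; a positive value avoids this.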

Sparse matrices of class "dgCMatrix" (Matrix package) are supported to speed up computation.

Note that Multinomial Naive Bayes is not available through the naive_bayes function.

Value

multinomial_naive_bayes returns an object of class "multinomial_naive_bayes", which is a list with the following components:

data

list with two components: x (matrix with predictors) and y (class variable).

levels

character vector with values of the class variable.

laplace

amount of Laplace smoothing (additive smoothing).

params

matrix with class conditional parameter estimates.

prior

numeric vector with prior probabilities.

call

the call that produced this object.
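
As an illustration of how prior and params relate to posterior probabilities, the following base R sketch applies the standard multinomial Naive Bayes rule by hand. All values are hypothetical stand-ins for the components of a fitted object, and the code is a sketch of the general technique, not necessarily what predict does internally.

```r
# Hypothetical values standing in for the `prior` and `params` components
prior <- c(A = 0.5, B = 0.5)
params <- matrix(c(0.50, 0.25, 0.25,   # column "A"
                   0.10, 0.60, 0.30),  # column "B"
                 ncol = 2,
                 dimnames = list(c("V1", "V2", "V3"), c("A", "B")))

# One new observation with its feature counts
newx <- matrix(c(2, 0, 1), nrow = 1,
               dimnames = list(NULL, c("V1", "V2", "V3")))

# Unnormalised log posterior: log prior + sum over features of count * log(parameter)
log_post <- sweep(newx %*% log(params), 2, log(prior), "+")

# Normalise per row; subtracting the row maximum first keeps exp() stable
post <- exp(log_post - apply(log_post, 1, max))
post <- post / rowSums(post)
post

# Predicted class: the column with the highest posterior probability
colnames(post)[max.col(post)]
```

Working on the log scale avoids underflow when many features with small probabilities are multiplied together.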

Author(s)

Michal Majka, michalmajka@hotmail.com

References

McCallum, Andrew; Nigam, Kamal (1998). A comparison of event models for Naive Bayes text classification. AAAI-98 Workshop on Learning for Text Categorization. 752. http://www.cs.cmu.edu/~knigam/papers/multinomial-aaaiws98.pdf

See Also

predict.multinomial_naive_bayes, tables, get_cond_dist, %class%, coef.multinomial_naive_bayes

Examples

library(naivebayes)

# Fix the random seed so the simulated data are reproducible
set.seed(1)

### Simulate the data:
cols <- 10 ; rows <- 100
M <- matrix(sample(0:5, rows * cols, TRUE, prob = c(0.95, rep(0.01, 5))), nrow = rows, ncol = cols)
y <- factor(sample(paste0("class", LETTERS[1:2]), rows, TRUE, prob = c(0.3,0.7)))
colnames(M) <- paste0("V", seq_len(ncol(M)))
laplace <- 1

### Train the Multinomial Naive Bayes
mnb <- multinomial_naive_bayes(x = M, y = y, laplace = laplace)
summary(mnb)

# Classification
head(predict(mnb, newdata = M, type = "class")) # head(mnb %class% M)

# Posterior probabilities
head(predict(mnb, newdata = M, type = "prob")) # head(mnb %prob% M)

# Parameter estimates
coef(mnb)


### Sparse data: train the Multinomial Naive Bayes
library(Matrix)
M_sparse <- Matrix(M, sparse = TRUE)
class(M_sparse) # dgCMatrix

# Fit the model with sparse data
mnb_sparse <- multinomial_naive_bayes(M_sparse, y, laplace = laplace)

# Classification
head(predict(mnb_sparse, newdata = M_sparse, type = "class"))

# Posterior probabilities
head(predict(mnb_sparse, newdata = M_sparse, type = "prob"))

# Parameter estimates
coef(mnb_sparse)

naivebayes documentation built on March 13, 2020, 1:31 a.m.