predict.naive_bayes: Predict Method for naive_bayes Objects


Description

Classification based on Naive Bayes models.

Usage

## S3 method for class 'naive_bayes'
predict(object, newdata = NULL, type = c("class","prob"),
  threshold = 0.001, eps = 0, ...)

Arguments

object

object of class inheriting from "naive_bayes".

newdata

matrix or data frame with categorical (character/factor/logical) or metric (numeric) predictors.

type

if "class", each new data point is assigned the class with the highest posterior probability. If "prob", the posterior probabilities for each class are returned.

threshold

value that replaces zero probabilities, or probabilities within the epsilon-range, for metric variables. Zero probabilities for categorical variables can instead be handled with Laplace (additive) smoothing.

eps

value that defines the epsilon-range: probabilities that are zero or close to zero (within this range) are replaced by threshold. It applies to metric variables only.

...

not used.
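
To illustrate how threshold and eps interact, the base R sketch below replaces values that are zero or fall inside the epsilon-range by threshold. The helper name replace_small and the comparison p <= eps are assumptions made for illustration; they are not taken from the package's internal code.

```r
# Hypothetical helper: replace densities that are zero or within the
# epsilon-range [0, eps] by `threshold` (assumed comparison: p <= eps).
replace_small <- function(p, threshold = 0.001, eps = 0) {
  p[p <= eps] <- threshold
  p
}

# Class-conditional densities of one metric predictor for three observations
dens <- c(0, 1e-12, 0.25)

replace_small(dens)               # defaults: only the exact zero is replaced
replace_small(dens, eps = 1e-9)   # the near-zero value is replaced as well
```

With the default eps = 0 only exact zeros are affected; a positive eps widens the range of values treated as "close to zero".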

Details

Computes conditional posterior probabilities for each class label using Bayes' rule under the assumption that the predictors are independent given the class. If no new data are provided, the data stored in the object are used. Logical variables are treated as categorical (binary) variables. Predictors with missing values are excluded from the computation of posterior probabilities.
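
The computation described above can be sketched by hand in base R for purely categorical predictors. This is a minimal illustration, not the package's implementation: the toy data and the helper posterior() are hypothetical, and no smoothing or thresholding is applied. The posterior is taken to be proportional to the prior multiplied by the class-conditional probabilities of the observed predictor values, and predictors with a missing value are skipped.

```r
# Toy training data (hypothetical): a class label and two categorical predictors
train <- data.frame(
  class = c("A", "A", "A", "B", "B", "B"),
  x1    = c("u", "u", "v", "u", "v", "v"),
  x2    = c("p", "q", "p", "q", "q", "p"),
  stringsAsFactors = TRUE
)

# Hypothetical helper: posterior probabilities for one observation,
# given as a named list of predictor values
posterior <- function(obs, train) {
  prior <- prop.table(table(train$class))        # P(class)
  score <- as.numeric(prior)
  names(score) <- names(prior)
  for (v in names(obs)) {
    if (is.na(obs[[v]])) next                    # skip predictors with missing values
    # Column-wise proportions give P(x = value | class)
    cond <- prop.table(table(train[[v]], train$class), margin = 2)
    score <- score * as.numeric(cond[as.character(obs[[v]]), names(score)])
  }
  score / sum(score)                             # normalise so the posteriors sum to 1
}

posterior(list(x1 = "u", x2 = "p"), train)       # both predictors used
posterior(list(x1 = "u", x2 = NA), train)        # x2 is missing and is skipped
```

For the first observation the unnormalised scores are 0.5 * 2/3 * 2/3 for class A and 0.5 * 1/3 * 1/3 for class B, giving posteriors of 0.8 and 0.2. In the second call only x1 contributes, giving 2/3 and 1/3.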

Value

predict.naive_bayes returns either a factor of class labels, each corresponding to the maximal conditional posterior probability, or a matrix of class-specific conditional posterior probabilities.
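
The relationship between the two return types can be checked in base R. The sketch below copies the posterior matrix from the example output on this page and picks, for each row, the class with the largest posterior probability; it is an illustration of that relationship, not the package's internal code.

```r
# Posterior probabilities copied from the example output on this page
prob <- matrix(c(0.7174638, 0.2825362,
                 0.2599418, 0.7400582,
                 0.6341795, 0.3658205,
                 0.5365311, 0.4634689,
                 0.7186026, 0.2813974),
               ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("classA", "classB")))

# For each row, take the column with the largest posterior probability
labels <- factor(colnames(prob)[max.col(prob, ties.method = "first")],
                 levels = colnames(prob))
labels
```

The resulting factor, classA classB classA classA classA, agrees with the labels shown for type = "class" in the example output.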

Author(s)

Michal Majka, michalmajka@hotmail.com

See Also

naive_bayes, plot.naive_bayes, tables, get_cond_dist, %class%

Examples

library(naivebayes)

### Simulate example data
n <- 100
set.seed(1)
data <- data.frame(class = sample(c("classA", "classB"), n, TRUE),
                   bern = sample(LETTERS[1:2], n, TRUE),
                   cat  = sample(letters[1:3], n, TRUE),
                   logical = sample(c(TRUE,FALSE), n, TRUE),
                   norm = rnorm(n),
                   count = rpois(n, lambda = c(5,15)))
train <- data[1:95, ]
test <- data[96:100, -1]

### Fit the model with default settings
nb <- naive_bayes(class ~ ., train)

# Classification
predict(nb, test, type = "class")
nb %class% test

# Posterior probabilities
predict(nb, test, type = "prob")
nb %prob% test


## Not run: 
vars <- 10
rows <- 1000000
y <- sample(c("a", "b"), rows, TRUE)

# Only categorical variables
X1 <- as.data.frame(matrix(sample(letters[5:9], vars * rows, TRUE),
                           ncol = vars))
nb_cat <- naive_bayes(x = X1, y = y)
nb_cat
system.time(pred2 <- predict(nb_cat, X1))

## End(Not run)

Example output

naivebayes 0.9.7 loaded
[1] classA classB classA classA classA
Levels: classA classB
[1] classA classB classA classA classA
Levels: classA classB
        classA    classB
[1,] 0.7174638 0.2825362
[2,] 0.2599418 0.7400582
[3,] 0.6341795 0.3658205
[4,] 0.5365311 0.4634689
[5,] 0.7186026 0.2813974
        classA    classB
[1,] 0.7174638 0.2825362
[2,] 0.2599418 0.7400582
[3,] 0.6341795 0.3658205
[4,] 0.5365311 0.4634689
[5,] 0.7186026 0.2813974

================================== Naive Bayes ================================== 
 
 Call: 
naive_bayes.default(x = X1, y = y)

--------------------------------------------------------------------------------- 
 
Laplace smoothing: 0

--------------------------------------------------------------------------------- 
 
 A priori probabilities: 

       a        b 
0.500527 0.499473 

--------------------------------------------------------------------------------- 
 
 Tables: 

--------------------------------------------------------------------------------- 
 ::: V1 (Categorical) 
--------------------------------------------------------------------------------- 
   
V1          a         b
  e 0.1993379 0.1995603
  f 0.2002449 0.2003251
  g 0.2006285 0.2000729
  h 0.1996955 0.2007716
  i 0.2000931 0.1992700

--------------------------------------------------------------------------------- 
 ::: V2 (Categorical) 
--------------------------------------------------------------------------------- 
   
V2          a         b
  e 0.1997235 0.1997185
  f 0.1997035 0.2001029
  g 0.2004967 0.1990398
  h 0.1995617 0.2008557
  i 0.2005147 0.2002831

--------------------------------------------------------------------------------- 
 ::: V3 (Categorical) 
--------------------------------------------------------------------------------- 
   
V3          a         b
  e 0.2006425 0.2000048
  f 0.1989223 0.2002330
  g 0.1998813 0.2005294
  h 0.1992400 0.1997405
  i 0.2013138 0.1994923

--------------------------------------------------------------------------------- 
 ::: V4 (Categorical) 
--------------------------------------------------------------------------------- 
   
V4          a         b
  e 0.2000551 0.1995784
  f 0.2000292 0.2006335
  g 0.2000012 0.2005494
  h 0.2000152 0.1987134
  i 0.1998993 0.2005254

--------------------------------------------------------------------------------- 
 ::: V5 (Categorical) 
--------------------------------------------------------------------------------- 
   
V5          a         b
  e 0.1998474 0.2001389
  f 0.2004767 0.2000188
  g 0.1999333 0.1998366
  h 0.1997075 0.2010039
  i 0.2000352 0.1990017

---------------------------------------------------------------------------------

# ... and 5 more tables

---------------------------------------------------------------------------------

   user  system elapsed 
  0.422   0.096   0.529 

naivebayes documentation built on March 13, 2020, 1:31 a.m.