The naivebayes
package offers a comprehensive set of functions that implement specialized versions of the Naïve Bayes classifier. In this article, we explore these functions and provide a detailed overview of their basic usage.
The package provides five key functions that cater to different types of data:
bernoulli_naive_bayes()
: Ideal for binary data, this function is specifically tailored to handle situations where the features are binary in nature, taking values of either 0 or 1. It effectively models the relationship between these binary features and the class labels.
multinomial_naive_bayes()
: This function is specifically designed for multinomial data. It is well-suited for cases where the features are discrete and have multiple categories. For example, it is commonly used in text classification tasks, where word counts are utilized as features.
poisson_naive_bayes()
: This function is specifically designed for count data. It is well-suited for scenarios where the features represent counts or frequencies. By assuming a Poisson distribution for the features, this function effectively models the relationship between the counts and the class labels.
gaussian_naive_bayes()
: Suitable for continuous data, this function assumes that the features follow a Gaussian (normal) distribution. It is commonly used when dealing with continuous numerical features, and it models the relationship between these features and the class labels.
nonparametric_naive_bayes()
This function introduces a nonparametric variant of the Naïve Bayes classifier. Unlike the specialized versions mentioned above, it does not make any assumptions regarding the specific distribution of the continuous features. Instead, it employs kernel density estimation to estimate the class conditional probabilities and feature distributions in a nonparametric manner. This allows for more flexible modeling and can be particularly useful when dealing with data that deviates from typical parametric assumptions.
Throughout this article, we will dive into each of these functions providing illustrative examples to demonstrate their practical usage. By the end, you will have a solid understanding of how to leverage these specialized versions of the Naïve Bayes classifier to tackle various types of data and classification problems.
We will also show equivalent calculations using the general naive_bayes()
function, allowing you to compare and understand the differences between the specialized and general approaches.
The Bernoulli Naive Bayes classifier is suitable for binary data. To demonstrate its usage, we start by simulating a dataset with binary features. We then train the Bernoulli Naive Bayes model using the bernoulli_naive_bayes()
function.
Simulate data:
library(naivebayes) cols <- 10 ; rows <- 100 ; probs <- c("0" = 0.4, "1" = 0.1) M <- matrix(sample(0:1, rows * cols, TRUE, probs), nrow = rows, ncol = cols) y <- factor(sample(paste0("class", LETTERS[1:2]), rows, TRUE, prob = c(0.3, 0.7))) colnames(M) <- paste0("V", seq_len(ncol(M)))
Train the Bernoulli Naive Bayes model:
laplace <- 0.5 bnb <- bernoulli_naive_bayes(x = M, y = y, laplace = laplace) # M has to be a matrix summary(bnb) head(predict(bnb, newdata = M, type = "prob")) # Equivalently head(bnb %prob% M) # Visualise marginal distributions plot(bnb, which = "V1", prob = "marginal") # Obtain model coefficients coef(bnb)
Equivalent calculation with naive_bayes function:
# It is made sure that the columns are factors with the 0-1 levels) df <- as.data.frame(lapply(as.data.frame(M), factor, levels = c(0, 1))) # sapply(df, class) nb <- naive_bayes(df, y, laplace = laplace) head(nb %prob% df)
Next, we explore the Multinomial Naive Bayes classifier. We simulate a dataset with multinomial features and train the model using the multinomial_naive_bayes()
function. We discuss the summary of the model, perform classification, compute posterior probabilities, and examine parameter estimates. Note that this specialized model is not available in the general naive_bayes()
function.
Simulate data:
set.seed(1) cols <- 3 # words rows <- 10000 # all documents rows_spam <- 100 # spam documents word_prob_non_spam <- prop.table(runif(cols)) word_prob_spam <- prop.table(runif(cols)) M1 <- t(rmultinom(rows_spam, size = cols, prob = word_prob_spam)) M2 <- t(rmultinom(rows - rows_spam, size = cols, prob = word_prob_non_spam)) M <- rbind(M1, M2) colnames(M) <- paste0("word", 1:cols) ; rownames(M) <- paste0("doc", 1:rows) head(M) y <- c(rep("spam", rows_spam), rep("non-spam", rows - rows_spam))
Train the Multinomial Naive Bayes:
laplace <- 0.5 mnb <- multinomial_naive_bayes(x = M, y = y, laplace = laplace) summary(mnb) # Classification head(predict(mnb, M)) # head(mnb %class% M) # Posterior probabilities head(predict(mnb, M, type = "prob")) # head(mnb %prob% M) # Parameter estimates coef(mnb) # Compare estimates to the true probabilities round(cbind(non_spam = word_prob_non_spam, spam = word_prob_spam), 4)
The Poisson Naive Bayes classifier is specifically designed for count data. We simulate count data and train the model using the poisson_naive_bayes()
function. We analyze the summary of the model, perform prediction, visualize marginal distributions, and obtain model coefficients.
Simulate data:
cols <- 10 ; rows <- 100 M <- matrix(rpois(rows * cols, lambda = 3), nrow = rows, ncol = cols) # is.integer(M) # [1] TRUE y <- factor(sample(paste0("class", LETTERS[1:2]), rows, TRUE)) colnames(M) <- paste0("V", seq_len(ncol(M)))
Train the Poisson Naive Bayes:
laplace <- 0 pnb <- poisson_naive_bayes(x = M, y = y, laplace = laplace) summary(pnb) head(predict(pnb, newdata = M, type = "prob")) # Visualise marginal distributions plot(pnb, which = "V1", prob = "marginal") # Obtain model coefficients coef(pnb)
Equivalent calculation with naive_bayes function:
nb2 <- naive_bayes(M, y, usepoisson = TRUE, laplace = laplace) head(predict(nb2, type = "prob"))
The Gaussian Naive Bayes classifier is discussed next. We use the famous Iris
^[https://en.wikipedia.org/wiki/Iris_flower_data_set] dataset and train the Gaussian Naive Bayes model using the gaussian_naive_bayes()
function. We summarize the model, visualize class conditional distributions, and obtain parameter estimates.
Data:
data(iris) y <- iris[[5]] M <- as.matrix(iris[-5])
Train the Gaussian Naive Bayes:
gnb <- gaussian_naive_bayes(x = M, y = y) summary(gnb) head(predict(gnb, newdata = M, type = "prob")) # Visualise class conditional distributions plot(gnb, which = "Sepal.Width", prob = "conditional") # Obtain parameter estimates coef(gnb) coef(gnb)[c(TRUE,FALSE)] # Only means
Equivalent calculation with general naive_bayes function:
nb3 <- naive_bayes(M, y) head(predict(nb3, newdata = M, type = "prob"))
Lastly, we explore the Non-Parametric Naive Bayes classifier. Again, using the Iris
^[https://en.wikipedia.org/wiki/Iris_flower_data_set] dataset, we train the model using the nonparametric_naive_bayes()
function. We will visualize the class conditional distributions, perform classification, so that it can be compared to the Gaussian Naive Bayes demonstrated in a previous section.
Data:
data(iris) y <- iris[[5]] M <- as.matrix(iris[-5])
Train the Non-Parametric Naive Bayes:
nnb <- nonparametric_naive_bayes(x = M, y = y) # summary(nnb) head(predict(nnb, newdata = M, type = "prob")) plot(nnb, which = "Sepal.Width", prob = "conditional")
Equivalent calculation with general naive_bayes function:
nb4 <- naive_bayes(M, y, usekernel = TRUE) head(predict(nb4, newdata = M, type = "prob"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.