
BIOS625: Naive Bayes Classifier with Discretization and Gaussian Estimation


myNBpackage lets users build a naive Bayes classifier, using either discretization or Gaussian estimation for continuous features, and predict labels for new data. To estimate the parameters of a feature's distribution, one must either assume a distribution or fit a nonparametric model to the training set. For continuous data, a common assumption is that the values within each class are Gaussian. Other distributions, such as Poisson, multinomial, or Bernoulli, can be assumed when they suit the feature better. Another common technique for continuous features is to discretize them into bins. Generally, when the training set is small or the true distribution is known, assuming a probability distribution is the better choice. With a large training set, discretization tends to perform better, because there is enough data to learn the shape of the distribution directly. This package provides functions for discretizing data and builds a naive Bayes classifier with Gaussian estimation for continuous variables.
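To make the Gaussian assumption concrete, here is a minimal base-R sketch of Gaussian naive Bayes on the iris data. It is an illustration only, not the code inside myNBpackage: the helper names `fit_gaussian_nb` and `predict_gaussian_nb` are hypothetical, and the sketch assumes all features are numeric and that new data has the same columns in the same order as the training data.

```r
data("iris")
train_idx <- c(1:40, 51:90, 101:140)
x <- iris[train_idx, -5]
y <- iris[train_idx, 5]
testx <- iris[-train_idx, -5]
testy <- iris[-train_idx, 5]

# Hypothetical helper: class priors plus per-class mean and sd of each feature.
fit_gaussian_nb <- function(x, y) {
  y <- droplevels(as.factor(y))
  list(
    levels = levels(y),
    prior  = as.vector(table(y)) / length(y),
    mean   = sapply(x, function(col) tapply(col, y, mean)),  # classes x features
    sd     = sapply(x, function(col) tapply(col, y, sd))
  )
}

# Hypothetical helper: score each row under every class and pick the best one.
predict_gaussian_nb <- function(model, newdata, type = c("class", "raw")) {
  type <- match.arg(type)
  newdata <- as.matrix(newdata)
  log_joint <- matrix(0, nrow(newdata), length(model$levels),
                      dimnames = list(NULL, model$levels))
  for (k in seq_along(model$levels)) {
    # log prior + sum over features of the log Gaussian density
    log_joint[, k] <- log(model$prior[k])
    for (j in seq_len(ncol(newdata))) {
      log_joint[, k] <- log_joint[, k] +
        dnorm(newdata[, j], model$mean[k, j], model$sd[k, j], log = TRUE)
    }
  }
  if (type == "class") {
    best <- max.col(log_joint, ties.method = "first")
    return(factor(model$levels[best], levels = model$levels))
  }
  # Subtract the row maximum before exponentiating to avoid underflow.
  post <- exp(log_joint - apply(log_joint, 1, max))
  post / rowSums(post)
}

model <- fit_gaussian_nb(x, y)
pred  <- predict_gaussian_nb(model, testx, "class")
probs <- predict_gaussian_nb(model, testx, "raw")
accuracy <- mean(pred == testy)  # share of held-out rows labelled correctly
```

Working in log space keeps the product of many small densities numerically stable; the `"class"` and `"raw"` options here simply mirror the argument values used in the package example and may not match its exact output format.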

Compared with the well-established e1071 package, myNBpackage is less efficient in time and memory on very large data sets, but it gives users more flexibility in handling continuous and categorical variables. A tutorial and function descriptions are provided to help users understand and use these methods.

Structure

This package includes 5 functions:

Installation

You can install the development version of myNBpackage like so:

devtools::install_github('sharechanxd/myNBpackage', build_vignettes = TRUE)
library("myNBpackage")

Example

This basic example shows how to solve a common problem and illustrates how the package's functions are used:

library(myNBpackage)
data("iris")

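# Basic example: train on 40 rows per species, hold out the remaining 10
train_idx <- c(1:40, 51:90, 101:140)
x <- iris[train_idx, -5]
y <- iris[train_idx, 5]
testx <- iris[-train_idx, -5]
m2 <- myNaiveBayes(x, y)
r1 <- predict_your_model(m2, testx, 'class')
r2 <- predict_your_model(m2, testx, 'raw')

# Discretization functions: discretize the training data,
# then apply the returned cut points to the test data
v <- disc_train_data(x, y)
x_dis <- v$discredata
testx_dis <- disc_test_data(testx, v$cutp)

# Discretization example: fit the classifier with discretization enabled
m2_2 <- myNaiveBayes(x, y, discre = TRUE)
r2_2 <- predict_your_model(m2_2, testx, 'class')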
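The discretization helpers in the example return cut points from the training data and reuse them on the test data. The base-R sketch below illustrates that train-then-apply pattern with simple equal-frequency binning. This is an assumed, unsupervised scheme chosen for illustration; the method actually used by `disc_train_data` may differ, and `learn_cutpoints` and `apply_cutpoints` are hypothetical names.

```r
data("iris")
train_idx <- c(1:40, 51:90, 101:140)
x <- iris[train_idx, -5]
y <- iris[train_idx, 5]
testx <- iris[-train_idx, -5]

# Hypothetical helper: interior cut points per feature from training quantiles.
learn_cutpoints <- function(x, n_bins = 4) {
  probs <- seq(0, 1, length.out = n_bins + 1)[-c(1, n_bins + 1)]
  lapply(x, function(col) unique(quantile(col, probs, names = FALSE)))
}

# Hypothetical helper: bin each column with the stored cut points.
# The outer bins are open-ended so unseen extreme values still get a bin.
apply_cutpoints <- function(x, cutp) {
  binned <- Map(function(col, cp) cut(col, breaks = c(-Inf, cp, Inf)), x, cutp)
  as.data.frame(binned, stringsAsFactors = TRUE)
}

cutp      <- learn_cutpoints(x)
x_dis     <- apply_cutpoints(x, cutp)
testx_dis <- apply_cutpoints(testx, cutp)

# Per-class bin frequencies for one feature: the kind of categorical
# likelihood table a naive Bayes model estimates from discretized data.
bin_freq <- prop.table(table(y, x_dis$Petal.Length), margin = 1)
```

Learning the cut points on the training rows only, and then applying them unchanged to the test rows, keeps the factor levels identical across both data sets and avoids leaking test information into the model.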

For more detailed examples and further information, run

browseVignettes(package = 'myNBpackage')

and click HTML to see more complex examples and a fuller walkthrough of these functions.



sharechanxd/myNBpackage documentation built on Dec. 23, 2021, 1:21 a.m.