Train_naiveBayes_multicore: Train_naiveBayes_multicore


Train_naiveBayes_multicore R Documentation

Train_naiveBayes_multicore

Description

Trains a Pareto density estimated naive Bayes model (PDENB) using multicore parallelism.

Usage

Train_naiveBayes_multicore(cl=NULL,Data,Cls,Predict=FALSE,Priors,UseMemshare=FALSE,...)

Arguments

cl

Optional cluster object created by package parallel (default NULL).

Data

[1:n,1:d] matrix of training data. It consists of n cases of d-dimensional data points. Every case has d attributes, variables or features.

Cls

[1:n] numerical vector with n numbers defining the classification. It has k unique numbers representing the arbitrary labels of the classification.

Predict

Optional boolean that decides the extent of the output. If TRUE, ClsTrain and Posteriors are returned, otherwise only Model and Thetas. Note: the parameter EvalPlausible can only be set to TRUE if Predict is set to TRUE.

Priors

Optional, [1:k] numerical vector defining the prior probabilities of the k classes. If missing, estimated from Cls.

UseMemshare

Optional boolean. If set to TRUE, functionality from the package Memshare is used, otherwise the classic package parallel is used.

...

Gaussian: (Optional: Default=TRUE). Assume Gaussian distribution.

Plausible: (Optional: TRUE: uses plausible Bayesian theorem, FALSE: non-plausible Bayesian theorem).

Type: (Optional: default=1, 1 = original PDE, 2 = R native density estimation).

Threshold: Lower bound for the standard deviation; it cannot be smaller than this value (default=1e-12).

PlotIt: Optional: Default=FALSE, TRUE: Plots Likelihoods

PlotCutOff: Optional: Scalar indicating how many features (starting from 1) should be plotted, or a numerical vector specifying the indices of the features to plot. Note: In the second case, avoid selecting too many features, as this may cause the plot to fail

ParetoRadiusPerFeauture: Optional [1:d] numerical vector of Pareto radii computed beforehand, see ParetoRadius or ParetoRadius_fast.

cl: Optional: a cluster object created by parallel. If given and ParetoRadiusPerFeauture is missing, ParetoRadiusPerFeauture is computed on multiple cores, otherwise on a single core.

Robust: Optional: Default=FALSE, TRUE: robust estimation of mean and std in case of Gaussian=TRUE

GlobalPR: Optional: Default=TRUE, FALSE: estimation of pareto radius for each class individually.
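
The Priors argument above is described as being estimated from Cls when it is missing. As a minimal base R sketch, assuming the estimate is simply the relative class frequency (the package's actual estimator is not shown in this page) and using hypothetical toy labels:

```r
# Hypothetical toy labels: n = 10 cases, k = 3 classes with arbitrary numeric labels
Cls <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)

# Assumption: priors are taken as relative class frequencies
counts <- table(Cls)
Priors <- as.numeric(counts) / length(Cls)
names(Priors) <- names(counts)

print(Priors)
```

A vector built this way has one entry per class and sums to 1, which is the shape expected for a user-supplied [1:k] Priors vector.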

Details

Precomputing ParetoRadiusPerFeauture can be useful to make cross-validation faster, although it should only be done on the training data.

If Plausible is not given, both options are evaluated using Shannon information.

c_Kernels_list and ListOfLikelihoods each have d elements, each storing a matrix [1:m,1:k], where usually m!=n. In contrast, the matrices in DataLikelihoodsPerClass are of size [1:n,1:k] as a result of interpolation.
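
The difference between likelihoods stored at m kernel positions and likelihoods interpolated to the n data points can be sketched in base R. This is only an illustration under stated assumptions: stats::density stands in for the Pareto density estimation, linear interpolation via stats::approx stands in for whatever interpolation the package uses, and the data are simulated.

```r
set.seed(1)

# Simulated values of one feature for one class (n = 200 cases)
x <- rnorm(200)

# Stand-in density estimate evaluated at m = 64 kernel positions (m != n)
dens <- stats::density(x, n = 64)

# Interpolate the density from the kernel grid onto the n data points
lik_at_data <- stats::approx(x = dens$x, y = dens$y, xout = x, rule = 2)$y

length(dens$x)       # m kernel positions
length(lik_at_data)  # n interpolated likelihood values
```

Repeating this for each of the k classes would give one [1:m,1:k] matrix on the kernel grid and one [1:n,1:k] matrix at the data points for that feature.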

Value

Model

List of model parameters and results.

c_Kernels_list

List of matrices, where each matrix represents the kernels of one feature for all classes.

ListOfLikelihoods

List of matrices, where each matrix represents the likelihood of one feature for all classes.

PDFs_funs

Nested list of depth 1, where the first index assigns the feature index and the second index assigns the class. The elements are functions for the density estimation for each feature and each class.

ParetoRadiusPerFeauture

Numeric vector which stores the pareto radius for each feature.

Theta

Parameters mean and standard deviation of the Gaussian distributions per class and feature.

Priors

Numeric vector which stores the prior probability of each class to appear.

PlausibleCenters

[1:k, 1:f] Numeric matrix which stores the centers for each feature and each class, where the row index assigns features and the column index assigns classes.

ClsTrain

[1:n] numerical vector with n numbers defining the classification. It has k unique numbers representing the arbitrary labels of the classification.

Posteriors

[1:n, 1:k] Numeric matrix with posterior probabilities.
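
How Priors, per-feature likelihoods, Posteriors and ClsTrain relate to each other in a naive Bayes model can be sketched in base R. This is not the package's implementation: it assumes the plain Gaussian case with the classic (non-plausible) Bayes rule, uses the built-in iris data, and clamps the standard deviation with a lower bound in the spirit of the Threshold argument.

```r
Data <- as.matrix(iris[, 1:4])      # [1:n,1:d] training data
Cls  <- as.integer(iris$Species)    # [1:n] numeric class labels
classes <- sort(unique(Cls))
n <- nrow(Data)
k <- length(classes)

# Priors as relative class frequencies (assumption)
Priors <- as.numeric(table(Cls)) / n

Threshold <- 1e-12                  # lower bound for the standard deviation
logpost <- matrix(0, n, k)
for (j in seq_len(k)) {
  X_j   <- Data[Cls == classes[j], , drop = FALSE]
  mu    <- colMeans(X_j)
  sigma <- pmax(apply(X_j, 2, sd), Threshold)
  # Naive assumption: the per-feature log-likelihoods are summed
  ll <- rowSums(sapply(seq_len(ncol(Data)), function(i) {
    dnorm(Data[, i], mean = mu[i], sd = sigma[i], log = TRUE)
  }))
  logpost[, j] <- log(Priors[j]) + ll
}

# Normalize in log space for numerical stability
logpost    <- logpost - apply(logpost, 1, max)
Posteriors <- exp(logpost) / rowSums(exp(logpost))   # [1:n,1:k]

# Assign each case to the class with the highest posterior
ClsTrain <- classes[max.col(Posteriors, ties.method = "first")]
table(Cls, ClsTrain)
```

Each row of Posteriors sums to 1, and ClsTrain has one label per training case, matching the shapes documented above.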

Author(s)

Michael Thrun

See Also

Predict_naiveBayes

Examples

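if(requireNamespace("FCPS", quietly=TRUE)){
# load the Hepta data set without attaching FCPS
data("Hepta", package="FCPS")
Data=Hepta$Data
Cls=Hepta$Cls

#non-parametric
V=Train_naiveBayes_multicore(cl=NULL,Data=Data,Cls=Cls,Gaussian=FALSE,Predict=TRUE)
ClsTrain=V$ClsTrain
# contingency table of given labels versus predicted labels
table(Cls,ClsTrain)
}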
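
A variant of the example with an actual cluster object might look as follows. This is a sketch under assumptions: the number of workers (2) is arbitrary, the function is assumed to be exported from the package PDEnaiveBayes, and the block only runs if both FCPS and PDEnaiveBayes are installed.

```r
if (requireNamespace("FCPS", quietly = TRUE) &&
    requireNamespace("PDEnaiveBayes", quietly = TRUE)) {
  data("Hepta", package = "FCPS")
  Data <- Hepta$Data
  Cls  <- Hepta$Cls

  # Create a local cluster with two workers (arbitrary choice)
  cl <- parallel::makeCluster(2)

  # Train on the cluster and make sure the workers are shut down afterwards
  V <- tryCatch(
    PDEnaiveBayes::Train_naiveBayes_multicore(cl = cl, Data = Data, Cls = Cls,
                                              Gaussian = FALSE, Predict = TRUE),
    finally = parallel::stopCluster(cl)
  )

  print(table(Cls, V$ClsTrain))
}
```

Stopping the cluster in a finally clause releases the worker processes even if training fails.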

PDEnaiveBayes documentation built on Nov. 17, 2025, 5:07 p.m.