The naive Bayes classifier assumes independence between predictor variables conditional on the response, and a Gaussian distribution of numeric predictors with mean and standard deviation computed from the training dataset. When building a naive Bayes classifier, every row in the training dataset that contains at least one NA will be skipped completely. If the test dataset has missing values, then those predictors are omitted in the probability calculation during prediction.
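The assumptions described above can be sketched in base R. This is a minimal illustration only, not H2O's implementation: the helper names nb_fit and nb_predict are hypothetical, only numeric predictors are handled, and the built-in iris data stands in for a training frame.

```r
# Minimal Gaussian naive Bayes in base R (illustrative; not H2O's code).
# nb_fit and nb_predict are hypothetical helper names.
nb_fit <- function(X, y) {
  keep <- complete.cases(X, y)            # rows with any NA are skipped entirely
  X <- X[keep, , drop = FALSE]
  y <- factor(y[keep])
  list(
    levels = levels(y),
    prior  = as.numeric(table(y)) / length(y),
    mean   = sapply(X, function(col) tapply(col, y, mean)),  # classes x predictors
    sd     = sapply(X, function(col) tapply(col, y, sd))
  )
}

nb_predict <- function(model, newrow) {
  # newrow: named numeric vector for one observation.
  # Predictors that are NA are left out of the probability calculation.
  used <- names(newrow)[!is.na(newrow)]
  log_post <- log(model$prior)
  for (v in used) {
    # Conditional independence: per-predictor log densities simply add up
    log_post <- log_post +
      dnorm(newrow[[v]], model$mean[, v], model$sd[, v], log = TRUE)
  }
  post <- exp(log_post - max(log_post))   # subtract the max for numerical stability
  setNames(post / sum(post), model$levels)
}

model <- nb_fit(iris[, 1:4], iris$Species)
obs <- c(Sepal.Length = 5.1, Sepal.Width = 3.5,
         Petal.Length = NA, Petal.Width = 0.2)
p <- nb_predict(model, obs)
print(round(p, 4))
```

Working in log space avoids underflow when many predictors are multiplied together; the missing Petal.Length value is skipped rather than imputed.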
h2o.naiveBayes(x, y, training_frame, model_id = NULL, nfolds = 0,
  seed = -1, fold_assignment = c("AUTO", "Random", "Modulo", "Stratified"),
  fold_column = NULL, keep_cross_validation_predictions = FALSE,
  keep_cross_validation_fold_assignment = FALSE, validation_frame = NULL,
  ignore_const_cols = TRUE, score_each_iteration = FALSE,
  balance_classes = FALSE, class_sampling_factors = NULL,
  max_after_balance_size = 5, max_hit_ratio_k = 0, laplace = 0,
  threshold = 0.001, min_sdev = 0.001, eps = 0, eps_sdev = 0,
  min_prob = 0.001, eps_prob = 0, compute_metrics = TRUE,
  max_runtime_secs = 0)

x 
(Optional) A vector containing the names or indices of the predictor variables to use in building the model. If x is missing, then all columns except y are used. 
y 
The name or column index of the response variable in the data. For naive Bayes the response must be a categorical/factor variable, since only classification models are trained (see Value). 
training_frame 
Id of the training data frame. 
model_id 
Destination id for this model; autogenerated if not specified. 
nfolds 
Number of folds for K-fold cross-validation (0 to disable or >= 2). Defaults to 0. 
seed 
Seed for random numbers (affects certain parts of the algo that are stochastic and those might or might not be enabled by default). Defaults to -1 (time-based random number). 
fold_assignment 
Cross-validation fold assignment scheme, if fold_column is not specified. The 'Stratified' option will stratify the folds based on the response variable, for classification problems. Must be one of: "AUTO", "Random", "Modulo", "Stratified". Defaults to AUTO. 
fold_column 
Column with cross-validation fold index assignment per observation. 
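keep_cross_validation_predictions 
Logical. Whether to keep the predictions of the cross-validation models. Defaults to FALSE. 
keep_cross_validation_fold_assignment 
Logical. Whether to keep the cross-validation fold assignment. Defaults to FALSE. 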
validation_frame 
Id of the validation data frame. 
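ignore_const_cols 
Logical. Ignore constant columns. Defaults to TRUE. 
score_each_iteration 
Logical. Whether to score during each iteration of model training. Defaults to FALSE. 
balance_classes 
Logical. Balance training data class counts via over/under-sampling (for imbalanced data). Defaults to FALSE. 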
class_sampling_factors 
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes. 
max_after_balance_size 
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes. Defaults to 5.0. 
max_hit_ratio_k 
Maximum number (top K) of predictions to use for hit ratio computation (for multiclass only, 0 to disable). Defaults to 0. 
laplace 
Laplace smoothing parameter. Defaults to 0. 
threshold 
This argument is deprecated, use 'min_sdev' instead. The minimum standard deviation to use for observations without enough data. Must be at least 1e-10. 
min_sdev 
The minimum standard deviation to use for observations without enough data. Must be at least 1e-10. 
eps 
This argument is deprecated, use 'eps_sdev' instead. A threshold cutoff to deal with numeric instability, must be positive. 
eps_sdev 
A threshold cutoff to deal with numeric instability, must be positive. 
min_prob 
Minimum probability to use for observations without enough data. 
eps_prob 
Cutoff below which probability is replaced with min_prob. 
compute_metrics 
Logical. Compute metrics on training data. Defaults to TRUE. 
max_runtime_secs 
Maximum allowed runtime in seconds for model training. Use 0 to disable. Defaults to 0. 
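The effect of the laplace and min_prob/eps_prob arguments can be illustrated in base R. This is a sketch under stated assumptions, not H2O's implementation: it uses the textbook additive-smoothing formula, which H2O may apply differently, and the helper names smoothed_cond_prob and floor_below are hypothetical.

```r
# Additive (Laplace) smoothing of P(level | class); illustrative, not H2O's code.
# Assumed formula: (count + laplace) / (class_count + laplace * number_of_levels)
smoothed_cond_prob <- function(x, y, laplace = 0) {
  counts <- table(y, x)                   # classes x predictor levels
  (counts + laplace) / (rowSums(counts) + laplace * ncol(counts))
}

# Replace values below a cutoff with a floor, mirroring how eps_prob and
# min_prob are described above (assumed to apply element-wise).
floor_below <- function(value, eps, minimum) {
  ifelse(value < eps, minimum, value)
}

x <- factor(c("y", "y", "n", "y", "n", "n"), levels = c("n", "y", "?"))
y <- factor(c("dem", "dem", "dem", "rep", "rep", "rep"))

p0 <- smoothed_cond_prob(x, y, laplace = 0)  # unseen level "?" gets probability 0
p1 <- smoothed_cond_prob(x, y, laplace = 1)  # every level gets nonzero probability
print(p0)
print(p1)
print(floor_below(c(0, 0.2), eps = 0.001, minimum = 0.001))
```

Without smoothing, a level never observed with a class zeroes out the whole product of conditional probabilities for that class; a positive laplace value or a probability floor prevents that.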
Returns an object of class H2OBinomialModel if the response has two categorical levels, and H2OMultinomialModel otherwise.
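library(h2o)
h2o.init()
# Load the housevotes example dataset shipped with the h2o package
votesPath <- system.file("extdata", "housevotes.csv", package = "h2o")
votes.hex <- h2o.uploadFile(path = votesPath, header = TRUE)
# Column 1 is the response; columns 2 to 17 are the predictors
h2o.naiveBayes(x = 2:17, y = 1, training_frame = votes.hex, laplace = 3)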
