sboost: sboost Learning Algorithm

sboost R Documentation

Description

A machine learning algorithm using AdaBoost on decision stumps.

Usage

sboost(features, outcomes, iterations = 1, positive = NULL, verbose = FALSE)

Arguments

features

data.frame containing the feature set.

outcomes

outcomes corresponding to the features.

iterations

number of boosting iterations.

positive

the positive outcome to test for; if NULL, the first outcome in alphabetical (or numerical) order will be chosen.
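
As a rough illustration of this default, the sketch below picks the first distinct outcome in sorted order. default_positive is a hypothetical helper written for this illustration, not a function exported by sboost, and the package's internal logic may differ in detail.

```r
# Hypothetical helper: mimic the documented default for `positive`
# (the first outcome in alphabetical or numerical order).
default_positive <- function(outcomes) {
  # Accept a one-column data.frame (as in malware[1]) or a plain vector.
  if (is.data.frame(outcomes)) outcomes <- outcomes[[1]]
  # Assumption: factors are compared by label rather than by level order.
  if (is.factor(outcomes)) outcomes <- as.character(outcomes)
  sort(unique(outcomes))[1]
}

default_positive(c("p", "e", "p", "e"))  # "e"
default_positive(c(1, 0, 1, 1))          # 0
```

Under this reading, outcomes coded "e" and "p" would default to "e", so a call that wants "p" as the positive outcome has to pass positive = "p" explicitly.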

verbose

if TRUE, a progress bar is displayed in the console.

Details

Factors and characters are treated as categorical features. Missing values are supported.
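
To check in advance which columns fall under this rule, a small base R sketch such as the following may help. is_categorical and the toy data.frame are hypothetical and only restate the rule above; they are not part of sboost.

```r
# Hypothetical helper: flag the columns that would count as categorical
# under the rule stated above (factor or character columns).
is_categorical <- function(features) {
  vapply(
    features,
    function(column) is.factor(column) || is.character(column),
    logical(1)
  )
}

toy <- data.frame(
  size  = c(1.5, NA, 3.2),           # numeric, with a missing value
  color = c("red", "blue", NA),      # character, so categorical
  shape = factor(c("a", "b", "a")),  # factor, so categorical
  stringsAsFactors = FALSE
)

is_categorical(toy)
```

The result is a named logical vector with one entry per column; here size is FALSE, while color and shape are TRUE.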

See https://jadonwagstaff.github.io/projects/sboost.html for a description of the algorithm.

For the original paper describing AdaBoost, see:

Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119-139 (1997)

Value

An sboost_classifier S3 object containing:

classifier

stump - the index of the decision stump
feature - name of the column that this stump splits on.
vote - the weight that this stump has on the final classifier.
orientation - shows how outcomes are split. If the feature is numeric, the vote is cast for the left side outcome when the feature value is less than split, and for the right side outcome otherwise. If the feature is categorical, the vote is cast for the left side outcome when the feature value is found in left_categories, and for the right side outcome otherwise.
split - if feature is numeric, the value where the decision stump splits the outcomes; otherwise, NA.
left_categories - if feature is categorical, shows the feature values that sway the vote to the left side outcome on the orientation split; otherwise, NA.
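
The voting rule described by orientation, split, and left_categories can be sketched in base R as follows. stump_vote and combine_votes are hypothetical helpers, their argument layout does not mirror how sboost stores the classifier, and missing values are not handled. The weighted combination follows the usual AdaBoost convention and is an assumption about how vote is used.

```r
# Hypothetical helper: the outcome one stump votes for, given one feature value.
# Assumes `value` is not missing.
stump_vote <- function(value, left_outcome, right_outcome,
                       split = NA, left_categories = NA) {
  if (!is.na(split)) {
    # Numeric feature: values below the split vote for the left side outcome.
    goes_left <- value < split
  } else {
    # Categorical feature: values found in left_categories vote left.
    goes_left <- value %in% left_categories
  }
  if (goes_left) left_outcome else right_outcome
}

# Hypothetical helper: sum the vote weights per outcome and return the
# outcome with the larger total (standard AdaBoost-style weighted vote).
combine_votes <- function(voted_outcomes, vote_weights) {
  totals <- tapply(vote_weights, voted_outcomes, sum)
  names(totals)[which.max(totals)]
}

stump_vote(2.5, "p", "e", split = 3)
stump_vote("red", "p", "e", left_categories = c("red", "brown"))
combine_votes(c("p", "e", "p"), c(0.4, 0.9, 0.3))
```

In the last call the two stumps voting "p" carry a combined weight of 0.7, which is less than the single 0.9 vote for "e", so the combined decision is "e".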

outcomes

Shows which outcome was treated as positive and which as negative.

training

stumps - how many decision stumps were trained.
features - how many features the training set contained.
instances - how many instances or rows the training set contained.
positive_prevalence - what fraction of the training instances were positive.
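
For orientation, the last three quantities can be reproduced directly from the training data. The sketch below is a hypothetical helper (training_summary is not part of sboost) and assumes outcomes is a plain vector.

```r
# Hypothetical helper: recompute the training summary from raw data.
training_summary <- function(features, outcomes, positive) {
  list(
    features            = ncol(features),             # number of feature columns
    instances           = nrow(features),             # number of rows
    positive_prevalence = mean(outcomes == positive)  # fraction of positive rows
  )
}

toy_features <- data.frame(x1 = c(1, 2, 3, 4, 5), x2 = c("a", "b", "a", "b", "a"))
toy_outcomes <- c("p", "e", "e", "p", "e")

training_summary(toy_features, toy_outcomes, positive = "p")
```

Two of the five toy outcomes are "p", so positive_prevalence is 0.4 here.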

call

Shows the parameters that were used to build the classifier.

See Also

predict.sboost_classifier - to get predictions from the classifier.

assess - to evaluate the performance of the classifier.

validate - to perform cross validation for the classifier training.

Examples

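# malware: column 1 holds the outcome, the remaining columns are features;
# train 5 stumps with outcome 1 as the positive class, then inspect the result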
malware_classifier <- sboost(malware[-1], malware[1], iterations = 5, positive = 1)
malware_classifier
malware_classifier$classifier

# mushrooms: same layout; outcome "p" is set as the positive class explicitly
mushroom_classifier <- sboost(mushrooms[-1], mushrooms[1], iterations = 5, positive = "p")
mushroom_classifier
mushroom_classifier$classifier

sboost documentation built on May 28, 2022, 1:12 a.m.