A Set of Configuration Settings for the Subgroup and Pattern Mining Algorithms
Objects are created by calls of the form
SDTaskConfig(...).
attributes: The list of attributes to consider for mining. Either a vector of attribute names, or NULL (the default), which includes all attributes.
discretize: Boolean, indicating whether to (automatically)
discretize numeric attributes (default discretize = TRUE). Depends on
parameter nbins: if a numeric attribute has at most nbins distinct values
in the dataset, each distinct value becomes its own category; otherwise
equal-frequency discretization is applied to that attribute.
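The discretization rule described above can be sketched in base R. This is only an illustration of the documented behavior, not the package's actual implementation; the helper name discretize_sketch and the use of quantile() and cut() are assumptions made for the example.

```r
# Hypothetical helper, not part of rsubgroup: mimics the documented rule
# for a single numeric attribute.
discretize_sketch <- function(x, nbins) {
  distinct <- sort(unique(x))
  if (length(distinct) <= nbins) {
    # Few distinct values: keep each value as its own category.
    return(factor(x, levels = distinct))
  }
  # Otherwise cut at quantiles so that bins hold roughly equal counts.
  # unique() guards against duplicated break points in skewed data.
  breaks <- unique(quantile(x, probs = seq(0, 1, length.out = nbins + 1),
                            names = FALSE))
  cut(x, breaks = breaks, include.lowest = TRUE)
}

few  <- discretize_sketch(c(1, 2, 2, 3, 1), nbins = 3)  # distinct values kept
many <- discretize_sketch(1:100, nbins = 4)             # equal-frequency bins
table(many)
```

With heavily tied data the quantile breaks can coincide, in which case this sketch yields fewer than nbins bins.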
method: A mining method; one of:
Beam-Search ("beam"),
BSD ("bsd"),
SD-Map ("sdmap"),
SD-Map enabling internal disjunctions ("sdmap-dis").
The default is method = "sdmap".
nbins: Specifies the number of bins to be used when
discretizing numeric attributes (see discretize above).
qf: A quality function; one of:
Adjusted Residuals ("ares"),
Binomial Test ("bin"),
Chi-Square Test ("chi2"),
Gain ("gain"),
Lift ("lift"),
Piatetsky-Shapiro ("ps"),
Relative Gain ("relgain"),
Weighted Relative Accuracy ("wracc").
The default is qf = "ps".
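To give a feel for what some of these quality functions measure, the following base R sketch computes three of them by hand for a binary target, using their common textbook definitions. The function name quality_sketch and its argument names are hypothetical, and the package may normalize or scale the values differently.

```r
# Hypothetical helper, not part of rsubgroup.
# n:  subgroup size            tp: positives inside the subgroup
# N:  population size          TP: positives in the whole population
quality_sketch <- function(n, tp, N, TP) {
  p  <- tp / n   # target share in the subgroup
  p0 <- TP / N   # target share in the population
  c(
    ps    = n * (p - p0),        # Piatetsky-Shapiro
    wracc = (n / N) * (p - p0),  # Weighted Relative Accuracy
    lift  = p / p0               # Lift
  )
}

# A subgroup of 100 records with 60 positives,
# drawn from 1000 records with 300 positives.
q <- quality_sketch(n = 100, tp = 60, N = 1000, TP = 300)
print(q)
```

Under these definitions, Piatetsky-Shapiro and Weighted Relative Accuracy differ only by the constant factor N, so they rank subgroups identically on a fixed dataset.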
k: The maximum number (top-k) of patterns
to discover, i.e., the best k rules according to the selected
quality function. The default is k = 20.
minqual: The minimal quality (default minqual = 0).
minsize: The minimal size of a subgroup (as an integer)
(minimal coverage of database records, default minsize = 0).
mintp: The minimal true positive (tp) threshold, an integer
(minimal (absolute) number of true positives in a subgroup, relevant for
binary target concepts only); defaults to mintp = 0.
maxlen: The maximal length of a description of
a pattern, i.e., the maximal number of conjunctions. This impacts both
understandability and efficiency. Simpler rules are easier to understand,
and a small maxlen will restrict the search space (default maxlen = 7).
nodefaults: Ignore default values, i.e.,
do not include the respective first value (with index 0) of each
attribute (default nodefaults = FALSE, i.e., include all values).
relfilter: Controls whether irrelevant
patterns are filtered during pattern mining; enabling this negatively
impacts performance (default relfilter = FALSE).
postfilter: Selects the post-processing
filter(s) to apply; one (or a vector) of:
Minimum Improvement (Global) ("min-improve-global"):
checks the patterns against all possible generalizations,
Minimum Improvement (Pattern Set) ("min-improve-set"):
checks the patterns against all their generalizations
in the result set,
Relevancy Filter ("relevancy"): removes patterns that
are strictly irrelevant,
Significant Improvement (Global) ("sig-improve-global"):
removes patterns that do not significantly improve
(default 0.01 level) w.r.t. all their possible generalizations,
Significant Improvement (Set) ("sig-improve-set"):
removes patterns that do not significantly improve
(default 0.01 level) w.r.t. all generalizations in the result set,
Weighted Covering ("weighted-covering"): performs weighted
covering on the data in order to select a covering set of
subgroups while reducing the overlap on the data.
By default no postfilter is set, i.e., postfilter = "".
parfilter: Provides the minimal improvement value for the postfilter (for min-improve-* filters), or the significance level (P) for sig-improve-* filters.
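As a rough illustration of how the slots fit together, a configuration might be built as follows. This is a sketch that assumes the rsubgroup package (and the Java runtime it depends on) is installed; the attribute names and all numeric values are placeholders chosen for the example, not recommendations.

```r
library(rsubgroup)  # assumption: package and its Java dependency are installed

config <- SDTaskConfig(
  attributes = c("checking_status", "employment"),  # hypothetical attribute names
  method     = "beam",             # Beam-Search instead of the default "sdmap"
  qf         = "wracc",            # Weighted Relative Accuracy
  k          = 10,                 # keep the 10 best patterns
  maxlen     = 3,                  # short, readable descriptions
  minsize    = 20,                 # ignore very small subgroups
  postfilter = "min-improve-set",  # compare against generalizations in the result set
  parfilter  = 0.01                # illustrative minimal improvement value
)
```

Slots that are not given keep the defaults listed above. The resulting object is intended to be handed to the mining functions listed under See Also.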
See also: DiscoverSubgroups, DiscoverSubgroupsByTask, CreateSDTask.