analyze_dataset: Analyze the given dataset.


Description

Find the attribute interactions in the given dataset.

Usage

analyze_dataset(dataset, classname = "class", seed_traintest = 42,
  seed = 42, classifier = NULL, alpha = 0.05, R = (1 +
  ceiling(1/alpha)), Rmin = 250, Rmax = 500, z = 2.57,
  prune_singletons = TRUE, parallel = TRUE, early_stopping = FALSE)

Arguments

dataset

The dataset to analyze

classname

The name of the class attribute in the dataset. Default is "class".

seed_traintest

Random seed used for splitting the data into training and testing sets, default is 42.

seed

Random seed, default is 42.

classifier

The classifier to be used, as a string. Default is NULL.

alpha

Significance level (default is 0.05).

R

Number of samples to use for calculating the accuracy. Default is R = 1 + ceiling(1 / alpha).

Rmin

Minimum number of replications to use for calculating p-values. Default is 250.

Rmax

Maximum number of replications to use for calculating p-values. Default is 500.

z

Parameter for confidence band for p-values. Use 1.96 for 95 percent, 2.25 for 97.5 percent and 2.57 for 99 percent. Default is 2.57.
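The defaults for R and z can be checked with base R. This is only a sketch: it assumes the confidence band is a two-sided normal band, which the description of z suggests but does not state, and the helper names default_R and z_for_level are made up for illustration.

```r
# Default number of samples, as written in the usage line: 1 + ceiling(1 / alpha).
default_R <- function(alpha) 1 + ceiling(1 / alpha)

# Two-sided standard normal quantile for a given confidence level.
# Assumption: the z values quoted in the documentation come from this quantile.
z_for_level <- function(level) qnorm(1 - (1 - level) / 2)

default_R(0.05)   # 21
default_R(0.01)   # 101

round(z_for_level(c(0.95, 0.975, 0.99)), 2)   # 1.96 2.24 2.58
```

Under that assumption, the documented constants 2.25 and 2.57 lie within 0.01 of the computed quantiles.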

prune_singletons

Whether singletons should be pruned. Default is TRUE.

parallel

Whether to calculate p-values in parallel (Boolean, default is TRUE). When run in parallel, the results are not deterministic, even with the same random seed.

early_stopping

Whether to use early stopping. Default is FALSE, in which case Rmin samples are used to calculate p-values. If TRUE, at least Rmin and at most Rmax samples are used.
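One way alpha, Rmin, Rmax and z could interact under early stopping is sketched below. It is not taken from the package source: the function early_stop_pvalue, its exceeds argument, and the stopping rule (stop once a normal-approximation band around the running p-value estimate no longer contains alpha) are all assumptions made for illustration.

```r
# Hypothetical sketch of a sequential p-value estimate with early stopping.
# exceeds: logical vector, TRUE where a replicate is at least as extreme
#          as the observed statistic (an assumed input format).
early_stop_pvalue <- function(exceeds, alpha = 0.05, Rmin = 250,
                              Rmax = 500, z = 2.57) {
  n_max <- min(Rmax, length(exceeds))
  stopifnot(Rmin <= n_max)

  for (n in Rmin:n_max) {
    p_hat <- mean(exceeds[seq_len(n)])            # running p-value estimate
    half  <- z * sqrt(p_hat * (1 - p_hat) / n)    # half-width of the band
    # Stop as soon as the band lies entirely on one side of alpha.
    if (p_hat + half < alpha || p_hat - half > alpha) break
  }
  list(p = p_hat, n = n)
}

# Clearly non-significant case: half of the replicates exceed the observation.
clear <- early_stop_pvalue(rep(c(TRUE, FALSE), 250))
clear$n   # 250, stops at Rmin

# Borderline case: one replicate in twenty exceeds the observation.
borderline <- early_stop_pvalue(rep(c(TRUE, rep(FALSE, 19)), 25))
borderline$n   # 500, runs up to Rmax
```

In this sketch a clear-cut result costs only Rmin replicates, while an estimate close to alpha uses the full Rmax.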

Value

A results structure (list with fields).
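A call might look like the following sketch. Several things here are assumptions rather than documented facts: that the package is installed under the name astrid, that analyze_dataset is exported, and that a data frame such as iris is an accepted dataset. The call is skipped when the package is not available.

```r
# Arguments for the call; names and defaults come from the usage line above.
call_args <- list(
  dataset   = iris,        # assumed input type: a data frame
  classname = "Species",   # class attribute of the iris data
  seed      = 42,
  parallel  = FALSE        # avoids the non-determinism documented for parallel runs
)

# Assumption: the package is installed as "astrid" and exports analyze_dataset.
if (requireNamespace("astrid", quietly = TRUE)) {
  res <- do.call(astrid::analyze_dataset, call_args)
  str(res, max.level = 1)  # inspect the top-level fields of the result list
} else {
  message("Package 'astrid' is not installed; skipping the call.")
}
```

The fields of the returned list are not described here, so str() is used only to inspect whatever the call returns.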


bwrc/astrid-r documentation built on May 13, 2019, 9:08 a.m.