relative: Relative Landmarking Meta-features

View source: R/relative.R

Description

Relative landmarking measures are landmarking measures computed with a ranking strategy: the landmarkers are compared by their rank rather than by their raw performance scores.
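
As a rough illustration of the ranking idea, the base R sketch below ranks three landmarkers within each fold and then summarizes the ranks. The scores are made-up numbers, and the rank direction and tie handling are those of base `rank()`; none of this is taken from the package's implementation.

```r
# Hypothetical per-fold accuracies for three landmarkers (made-up numbers).
scores <- rbind(
  fold1 = c(worstNode = 0.50, bestNode = 0.90, oneNN = 0.95),
  fold2 = c(worstNode = 0.55, bestNode = 0.93, oneNN = 0.91),
  fold3 = c(worstNode = 0.48, bestNode = 0.88, oneNN = 0.96)
)

# Rank the landmarkers within each fold. With base rank(), the highest
# score receives the largest rank and ties share an averaged rank.
ranks <- t(apply(scores, 1, rank))

# Summarize each landmarker's ranks across folds, mirroring
# summary = c("mean", "sd").
summarized <- sapply(c("mean", "sd"), function(f) apply(ranks, 2, match.fun(f)))
print(ranks)
print(summarized)
```

Ranking discards the size of the gap between two landmarkers and keeps only their order, which makes the resulting values comparable across datasets of different difficulty.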

Usage

relative(...)

## Default S3 method:
relative(
  x,
  y,
  features = "all",
  summary = c("mean", "sd"),
  size = 1,
  folds = 10,
  score = "accuracy",
  ...
)

## S3 method for class 'formula'
relative(
  formula,
  data,
  features = "all",
  summary = c("mean", "sd"),
  size = 1,
  folds = 10,
  score = "accuracy",
  ...
)

Arguments

...

Further arguments passed to the summarization functions.

x

A data.frame containing only the input attributes.

y

A factor response vector with one label for each row/component of x.

features

A list of feature names, or "all" to include all of them.

summary

A list of summarization functions, or empty to return all values. See the post.processing method for more information. (Default: c("mean", "sd"))

size

The proportion of examples to subsample, expressed as a fraction (for example, 0.7 for 70%). Values other than 1 generate the subsampling-based relative landmarking meta-features. (Default: 1.0)

folds

The number k of equal-sized subsamples used in k-fold cross-validation. (Default: 10)

score

The evaluation measure used to score the classification performance. One of "accuracy", "balanced.accuracy", or "kappa". (Default: "accuracy")

formula

A formula to define the class column.

data

A data.frame containing the input attributes and the class column. The Details section describes the valid meta-feature names for this group.
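
The size, folds, and score arguments together describe an evaluation protocol. The base R sketch below shows one way such a protocol could work. It is not the package's implementation: the helper names (`subsample_rows`, `make_folds`, `accuracy`, `balanced_accuracy`, `cohen_kappa`) and the stratified fold assignment are assumptions made for illustration, and the scores follow the usual textbook definitions.

```r
# Illustrative helpers only; names and details are assumptions, not mfe internals.

# size: keep a random proportion of the rows (size = 1 keeps every row).
subsample_rows <- function(n, size = 1) {
  sort(sample.int(n, size = round(size * n)))
}

# folds: assign each example to one of k folds, class by class (stratified).
make_folds <- function(y, k = 10) {
  fold <- integer(length(y))
  for (cl in levels(y)) {
    idx <- which(y == cl)
    fold[idx] <- sample(rep_len(seq_len(k), length(idx)))
  }
  fold
}

# Confusion matrix with identical row and column levels, so diag() lines up.
confusion <- function(truth, pred) {
  lv <- levels(truth)
  table(truth = factor(truth, levels = lv), pred = factor(pred, levels = lv))
}

# score = "accuracy": fraction of correctly classified examples.
accuracy <- function(truth, pred) {
  cm <- confusion(truth, pred)
  sum(diag(cm)) / sum(cm)
}

# score = "balanced.accuracy": mean per-class recall.
balanced_accuracy <- function(truth, pred) {
  cm <- confusion(truth, pred)
  mean(diag(cm) / rowSums(cm), na.rm = TRUE)  # skip classes absent from truth
}

# score = "kappa": agreement corrected for chance (Cohen's kappa).
cohen_kappa <- function(truth, pred) {
  cm <- confusion(truth, pred)
  n <- sum(cm)
  observed <- sum(diag(cm)) / n
  expected <- sum(rowSums(cm) * colSums(cm)) / n^2
  (observed - expected) / (1 - expected)
}

set.seed(1)
truth <- factor(c("a", "a", "a", "b", "b", "c"))
pred  <- c("a", "a", "b", "b", "b", "a")
print(c(accuracy = accuracy(truth, pred),
        balanced = balanced_accuracy(truth, pred),
        kappa    = cohen_kappa(truth, pred)))
```

On imbalanced data the three scores can differ noticeably: plain accuracy rewards predicting the majority class, whereas balanced accuracy and kappa penalize it.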

Details

The following features are allowed for this method:

"bestNode"

Construct a single-node decision tree induced by the most informative attribute, to assess linear separability (multi-valued).

"eliteNN"

Elite nearest neighbor induces a 1-nearest neighbor model using only the most informative attribute in the dataset. Restricting the model to the subset of informative attributes is expected to make it noise tolerant (multi-valued).

"linearDiscr"

Apply the Linear Discriminant classifier to construct a linear split (not parallel to the axes) in the data, to assess linear separability (multi-valued).

"naiveBayes"

Evaluate the performance of the Naive Bayes classifier. It assumes that the attributes are independent and assigns each example to a class based on the Bayes probability (multi-valued).

"oneNN"

Evaluate the performance of the 1-nearest neighbor classifier. It uses the Euclidean distance to the nearest neighbor to determine how noisy the data is (multi-valued).

"randomNode"

Construct a single-node decision tree induced by a randomly chosen attribute. Combined with the "bestNode" measure, it can help assess linear separability (multi-valued).

"worstNode"

Construct a single-node decision tree induced by the least informative attribute. Combined with the "bestNode" measure, it can help assess linear separability (multi-valued).
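
To make the landmarker descriptions above more concrete, here is a minimal base R sketch of the idea behind "oneNN": a 1-nearest neighbor classifier with Euclidean distance, scored on a held-out half of iris. The function `one_nn_predict` and the odd/even split are assumptions for illustration; the package may build and evaluate its landmarkers differently.

```r
# Illustrative 1-nearest neighbor classifier; not the mfe implementation.
one_nn_predict <- function(train_x, train_y, test_x) {
  train_x <- as.matrix(train_x)
  test_x <- as.matrix(test_x)
  nearest <- apply(test_x, 1, function(row) {
    # Squared Euclidean distance from this test row to every training row.
    which.min(colSums((t(train_x) - row)^2))
  })
  train_y[nearest]
}

# Hold-out evaluation: train on odd rows, test on even rows.
train <- seq(1, nrow(iris), by = 2)
test  <- seq(2, nrow(iris), by = 2)
pred <- one_nn_predict(iris[train, 1:4], iris$Species[train], iris[test, 1:4])
holdout_accuracy <- mean(pred == iris$Species[test])
print(holdout_accuracy)
```

A landmarker of this kind is cheap to run, which is the point: its score serves as a quick signal about the dataset rather than as a tuned model.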

Value

A list named by the requested meta-features.
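
A short sketch of how the returned list might be inspected follows. It assumes the mfe package is installed and relies only on the statement above that the list is named by the requested meta-features; the exact shape of each element is not assumed, so `str()` is used to look at it.

```r
# Runs only when mfe is available; otherwise the block is skipped.
has_mfe <- requireNamespace("mfe", quietly = TRUE)

if (has_mfe) {
  res <- mfe::relative(Species ~ ., iris,
                       features = c("bestNode", "worstNode"))
  print(names(res))  # names of the requested meta-features
  str(res)           # inspect the summarized values of each one
}
```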

References

Johannes Fürnkranz, Johann Petrak, Pavel Brazdil, and Carlos Soares. On the use of Fast Subsampling Estimates for Algorithm Recommendation. Technical Report, pages 1-9, 2002.

See Also

Other meta-features: clustering(), complexity(), concept(), general(), infotheo(), itemset(), landmarking(), model.based(), statistical()

Examples

## Extract all meta-features using formula
relative(Species ~ ., iris)

## Extract some meta-features
relative(iris[1:4], iris[5], c("bestNode", "randomNode", "worstNode"))

## Use another summarization function
relative(Species ~ ., iris, summary=c("min", "median", "max"))

## Use 2 folds and balanced accuracy
relative(Species ~ ., iris, folds=2, score="balanced.accuracy")

## Extract the subsampling-based relative landmarking
relative(Species ~ ., iris, size=0.7)

mfe documentation built on July 1, 2020, 10:46 p.m.