landmarking: Landmarking and Subsampling Landmarking Meta-features

Description Usage Arguments Details Value References See Also Examples

View source: R/landmarking.R

Description

Landmarking measures evaluate the performance of simple and fast learners, and use these performance values as meta-features.

Usage

landmarking(...)

## Default S3 method:
landmarking(
  x,
  y,
  features = "all",
  summary = c("mean", "sd"),
  size = 1,
  folds = 10,
  score = "accuracy",
  ...
)

## S3 method for class 'formula'
landmarking(
  formula,
  data,
  features = "all",
  summary = c("mean", "sd"),
  size = 1,
  folds = 10,
  score = "accuracy",
  ...
)

Arguments

...

Further arguments passed to the summarization functions.

x

A data.frame containing only the input attributes.

y

A factor response vector with one label for each row/component of x.

features

A list of feature names, or "all" to include them all.

summary

A list of summarization functions, or empty for all values. See the post.processing method for more information. (Default: c("mean", "sd"))

size

The percentage of examples subsampled. Values different from 1 generate the subsampling-based landmarking metafeatures. (Default: 1.0)

folds

The number of k equal-size subsamples in k-fold cross-validation. (Default: 10)

score

The evaluation measure used to score the classification performance. One of c("accuracy", "balanced.accuracy", "kappa"). (Default: "accuracy")

formula

A formula to define the class column.

data

A data.frame dataset containing the input attributes and the class. The Details section describes the valid values for this group.

Details

The following features are allowed for this method:

"bestNode"

Construct a single decision-tree node model induced by the most informative attribute, to establish the linear separability (multi-valued).

"eliteNN"

Elite nearest neighbor uses the most informative attribute in the dataset to induce a 1-nearest neighbor model. With this subset of informative attributes, the model is expected to be noise tolerant (multi-valued).

"linearDiscr"

Apply the Linear Discriminant classifier to construct a linear (non-axis-parallel) split in the data, to establish the linear separability (multi-valued).

"naiveBayes"

Evaluate the performance of the Naive Bayes classifier. It assumes that the attributes are independent and each example belongs to a certain class based on the Bayes probability (multi-valued).

"oneNN"

Evaluate the performance of the 1-nearest neighbor classifier. It uses the Euclidean distance to the nearest neighbor to determine how noisy the data is (multi-valued).

"randomNode"

Construct a single decision-tree node model induced by a random attribute. In combination with the "bestNode" measure, it can establish the linear separability (multi-valued).

"worstNode"

Construct a single decision-tree node model induced by the least informative attribute. In combination with the "bestNode" measure, it can establish the linear separability (multi-valued).
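A minimal sketch of the node-based combination described above (assuming the mfe package is installed): extracting only the "bestNode" and "worstNode" measures and comparing their mean accuracies gives a rough indication of how much single-attribute separability varies across the dataset.

```r
library(mfe)

# Extract only the node-based landmarking measures, summarized by the mean.
res <- landmarking(Species ~ ., iris,
                   features = c("bestNode", "worstNode"),
                   summary = "mean")

# Compare the two mean accuracies side by side; a large gap suggests the
# attributes differ strongly in how well each one alone separates the classes.
unlist(res)
```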

Value

A list named by the requested meta-features.

References

Bernhard Pfahringer, Hilan Bensusan, and Christophe Giraud-Carrier. Meta-learning by landmarking various learning algorithms. In 17th International Conference on Machine Learning (ICML), pages 743 - 750, 2000.

See Also

Other meta-features: clustering(), complexity(), concept(), general(), infotheo(), itemset(), model.based(), relative(), statistical()

Examples

## Extract all meta-features using formula
landmarking(Species ~ ., iris)

## Extract some meta-features
landmarking(iris[1:4], iris[5], c("bestNode", "randomNode", "worstNode"))

## Use another summarization function
landmarking(Species ~ ., iris, summary=c("min", "median", "max"))

## Use 2 folds and balanced accuracy
landmarking(Species ~ ., iris, folds=2, score="balanced.accuracy")

## Extract the subsampling-based landmarking
landmarking(Species ~ ., iris, size=0.7)

mfe documentation built on July 1, 2020, 10:46 p.m.