general: General meta-features



Description

General meta-features capture basic information about the dataset, such as the number of instances, attributes and classes. They are also known as simple measures.

Usage

general(...)

## Default S3 method:
general(x, y, features = "all", summary = c("mean", "sd"), ...)

## S3 method for class 'formula'
general(formula, data, features = "all", summary = c("mean", "sd"), ...)

Arguments

...

Not used.

x

A data.frame containing only the input attributes.

y

A factor response vector with one label for each row/component of x.

features

A list of feature names, or "all" to include all of them. The supported values are described in the Details section. (Default: "all")

summary

A list of summarization functions, or empty to return all values without summarization. See the post.processing method for more information. (Default: c("mean", "sd"))

formula

A formula to define the class column.

data

A data.frame containing the input attributes and the class column.
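To illustrate the summary argument described above, here is a minimal base R sketch of how a multi-valued measure such as the class proportions could be reduced by a list of summarization functions. It does not call the package; the helper name summarize_values is hypothetical, and the package's own post-processing may differ in detail.

```r
# Hypothetical helper (not part of the package): apply each named
# summarization function to a vector of values.
summarize_values <- function(values, summary = c("mean", "sd")) {
  # An empty summary list returns the values without summarization.
  if (length(summary) == 0) {
    return(values)
  }
  # sapply() over a character vector names the result after each function.
  sapply(summary, function(f) do.call(f, list(values)))
}

# Class proportions of iris: one value per class.
class_freq <- as.numeric(table(iris$Species)) / nrow(iris)

res <- summarize_values(class_freq)            # default: mean and sd
raw <- summarize_values(class_freq, c())       # no summarization
print(res)
print(raw)
```

With the default summary the three proportions collapse into two values named mean and sd; with an empty summary all three proportions are kept.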

Details

The following features are allowed for this method:

"attrToInst"

Ratio of the number of attributes to the number of instances, also known as dimensionality.

"catToNum"

Ratio of the number of categorical attributes to the number of numeric attributes.

"freqClass"

Proportion of instances belonging to each class (multi-valued).

"instToAttr"

Ratio of the number of instances to the number of attributes.

"nrAttr"

Number of attributes.

"nrBin"

Number of binary attributes.

"nrCat"

Number of categorical attributes.

"nrClass"

Number of classes.

"nrInst"

Number of instances.

"nrNum"

Number of numeric attributes.

"numToCat"

Ratio of the number of numeric attributes to the number of categorical attributes.
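The definitions above can be sketched directly in base R. The block below is an illustration on iris, not the package's implementation: it assumes that every non-numeric column counts as categorical and that a ratio with a zero denominator is reported as NA. "nrBin" is left out because its exact definition is not stated here.

```r
# Input attributes and class labels, as in the default method's x and y.
x <- iris[, 1:4]
y <- iris$Species

nr_inst <- nrow(x)
nr_attr <- ncol(x)

# Assumption: a column is either numeric or categorical.
is_num <- vapply(x, is.numeric, logical(1))
nr_num <- sum(is_num)
nr_cat <- sum(!is_num)

# Assumption: ratios with a zero denominator are reported as NA.
safe_ratio <- function(a, b) if (b == 0) NA_real_ else a / b

sketch <- list(
  attrToInst = safe_ratio(nr_attr, nr_inst),
  catToNum   = safe_ratio(nr_cat, nr_num),
  freqClass  = as.numeric(table(y)) / nr_inst,  # multi-valued
  instToAttr = safe_ratio(nr_inst, nr_attr),
  nrAttr     = nr_attr,
  nrCat      = nr_cat,
  nrClass    = nlevels(y),
  nrInst     = nr_inst,
  nrNum      = nr_num,
  numToCat   = safe_ratio(nr_num, nr_cat)
)
str(sketch)
```

For iris this gives 4 attributes over 150 instances, so attrToInst is 4/150 and instToAttr is 37.5; with no categorical attributes, catToNum is 0 and numToCat is NA under the assumption above.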

Value

A list named by the requested meta-features.
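Because the result is a named list, individual measures can be read by name, and the whole result can be flattened into a single numeric vector for use as a meta-example. The sketch below uses a hand-built list shaped like the documented return value rather than calling the package, so the values are illustrative only.

```r
# Hand-built stand-in for a returned list (not produced by the package here).
mf <- list(
  nrAttr    = 4,
  nrClass   = 3,
  freqClass = c(mean = 1 / 3, sd = 0)
)

# Access a single measure by name.
mf$nrAttr

# Flatten to one named numeric vector; summarized measures get
# "<feature>.<summary>" names from unlist().
flat <- unlist(mf)
print(names(flat))
```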

References

Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994.

Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418-423, 1999.

Ciro Castiello, Giovanna Castellano, and Anna M. Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457-468, 2005.

See Also

Other meta-features: clustering(), complexity(), concept(), infotheo(), itemset(), landmarking(), model.based(), relative(), statistical()

Examples

## Extract all meta-features
general(Species ~ ., iris)

## Extract some meta-features
general(iris[1:100, 1:4], iris[1:100, 5], c("nrAttr", "nrClass"))

## Extract all meta-features without summarizing freqClass
general(Species ~ ., iris, summary=c())

## Use other summarization functions
general(Species ~ ., iris, summary=c("sd","min","iqr"))

Example output

$attrToInst
[1] 0.02666667

$catToNum
[1] 0

$freqClass
     mean        sd 
0.3333333 0.0000000 

$instToAttr
[1] 37.5

$nrAttr
[1] 4

$nrBin
[1] 0

$nrCat
[1] 0

$nrClass
[1] 3

$nrInst
[1] 150

$nrNum
[1] 4

$numToCat
[1] NA

$nrAttr
[1] 4

$nrClass
[1] 2

$attrToInst
[1] 0.02666667

$catToNum
[1] 0

$freqClass
non.aggregated1 non.aggregated2 non.aggregated3 
      0.3333333       0.3333333       0.3333333 

$instToAttr
[1] 37.5

$nrAttr
[1] 4

$nrBin
[1] 0

$nrCat
[1] 0

$nrClass
[1] 3

$nrInst
[1] 150

$nrNum
[1] 4

$numToCat
[1] NA

$attrToInst
[1] 0.02666667

$catToNum
[1] 0

$freqClass
       sd       min       iqr 
0.0000000 0.3333333 0.0000000 

$instToAttr
[1] 37.5

$nrAttr
[1] 4

$nrBin
[1] 0

$nrCat
[1] 0

$nrClass
[1] 3

$nrInst
[1] 150

$nrNum
[1] 4

$numToCat
[1] NA

mfe documentation built on July 1, 2020, 10:46 p.m.