metafeatures: Extract meta-features from a dataset

Description

A simple way to extract meta-features from a dataset: all meta-features from each selected group are extracted.

Usage

metafeatures(...)

## Default S3 method:
metafeatures(x, y, groups = "default", summary = c("mean", "sd"), ...)

## S3 method for class 'formula'
metafeatures(formula, data, groups = "default", summary = c("mean", "sd"), ...)

Arguments

...

Optional arguments to the summary methods.

x

A data.frame containing only the input attributes.

y

A factor response vector with one label for each row/component of x.

groups

A list of meta-feature groups, "default" for the traditional groups of meta-features, or "all" to include all of them. The Details section describes the valid values for this parameter.

summary

A list of summarization functions, or empty to keep all values. See the post.processing method for more information. (Default: c("mean", "sd"))

formula

A formula to define the class column.

data

A data.frame containing the input attributes and the class column.
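
To illustrate the summary argument: some meta-features yield one value per attribute, and the summarization functions reduce those values to a fixed set of named entries. The base R sketch below shows only that idea. The helper `summarize_values` and the choice of per-column variance are illustrative assumptions, not mfe's implementation; see post.processing for the actual behavior.

```r
# Hypothetical helper (not part of mfe): apply each named summary function to a
# numeric vector and label the results "<name>.<function>".
summarize_values <- function(values, name, summary = c("mean", "sd")) {
  out <- vapply(summary, function(f) do.call(f, list(values)), numeric(1))
  names(out) <- paste(name, summary, sep = ".")
  out
}

# One value per input attribute (here: the variance of each numeric column).
per_attribute <- vapply(iris[1:4], var, numeric(1))

# Default summary functions, then an alternative set.
summarize_values(per_attribute, "variance")
summarize_values(per_attribute, "variance", summary = c("min", "median", "max"))
```

The resulting names follow the same `<meta-feature>.<summary>` pattern seen in the example output, although the values here are not expected to match what mfe computes.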

Details

The following groups are allowed for this method:

"infotheo"

Include all information theoretical meta-features. See infotheo for more details.

"general"

Include all general (simple) meta-features. See general for more details.

"landmarking"

Include all landmarking meta-features. See landmarking for more details.

"model.based"

Include all model based meta-features. See model.based for more details.

"statistical"

Include all statistical meta-features. See statistical for more details.

"clustering"

Include all clustering meta-features. See clustering for more details.

"complexity"

Include all complexity meta-features. See complexity for more details.

"concept"

Include all concept variation meta-features. See concept for more details.

"itemset"

Include all itemset meta-features. See itemset for more details.
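
When group names come from user input or a configuration file, it can help to check them against the list above before calling metafeatures. The following is a minimal sketch in base R; `valid_groups` and `check_groups` are hypothetical names introduced here and are not part of mfe.

```r
# Group names taken from the list above; "default" and "all" are the two
# shortcut values described under the groups argument.
valid_groups <- c("infotheo", "general", "landmarking", "model.based",
                  "statistical", "clustering", "complexity", "concept",
                  "itemset")

# Hypothetical helper (not part of mfe): stop early on an unknown group name.
check_groups <- function(groups) {
  allowed <- c("default", "all", valid_groups)
  unknown <- setdiff(groups, allowed)
  if (length(unknown) > 0) {
    stop("Unknown meta-feature group(s): ", paste(unknown, collapse = ", "))
  }
  invisible(groups)
}

# Passes silently because every name is in the allowed set.
check_groups(c("general", "statistical", "infotheo"))
```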

Value

A numeric vector named by the meta-features from the specified groups.
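
Because the result is a plain named numeric vector, standard base R indexing applies. The sketch below assumes a small hand-built vector `mf` whose four entries are copied from the example output on this page; in practice `mf` would be the result of a metafeatures call.

```r
# Stand-in for a metafeatures() result; values copied from the example output.
mf <- c(
  general.nattribute   = 4,
  general.ninstance    = 150,
  statistical.iqr.mean = 1.28810588,
  statistical.iqr.sd   = 0.25053916
)

# Look up a single meta-feature by name.
mf[["general.ninstance"]]

# Keep only the meta-features whose name starts with a given group prefix.
general_only <- mf[startsWith(names(mf), "general.")]

# Reshape into a data.frame, e.g. to stack results from several datasets.
mf_table <- data.frame(metafeature = names(mf), value = unname(mf))
```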

Examples

## Extract the meta-features of the default groups
metafeatures(Species ~ ., iris)

## Extract some groups of meta-features
metafeatures(iris[1:4], iris[5], c("general", "statistical", "infotheo"))

## Use other summary methods
metafeatures(Species ~ ., iris, summary=c("min", "median", "max"))

Example output

                      discriminant.cancor 
                               0.98482089 
                discriminant.cancor.fract 
                               0.81372024 
           discriminant.center.of.gravity 
                               3.20828116 
                     discriminant.discfct 
                               0.66666667 
                 discriminant.eigen.fract 
                               0.92461872 
              discriminant.max.eigenvalue 
                               4.22824171 
             discriminant.min.eighenvalue 
                               0.02383509 
                     discriminant.sdratio 
                               1.27722888 
                     discriminant.wlambda 
                               0.34245841 
              general.defective.instances 
                               0.00000000 
                   general.dimensionality 
                               0.02666667 
                   general.majority.class 
                               0.33333333 
                   general.missing.values 
                               0.00000000 
                       general.nattribute 
                               4.00000000 
                          general.nbinary 
                               0.00000000 
                          general.nclasse 
                               3.00000000 
                        general.ninstance 
                             150.00000000 
                         general.nnumeric 
                               4.00000000 
                        general.nsymbolic 
                               0.00000000 
                          general.pbinary 
                               0.00000000 
                         general.pnumeric 
                               1.00000000 
                        general.psymbolic 
                               0.00000000 
                          general.sdclass 
                               0.00000000 
   infotheo.attributes.concentration.mean 
                               0.20980486 
     infotheo.attributes.concentration.sd 
                               0.11958799 
          infotheo.attribute.entropy.mean 
                               0.98073290 
            infotheo.attribute.entropy.sd 
                               0.02628825 
        infotheo.class.concentration.mean 
                               0.51481787 
          infotheo.class.concentration.sd 
                               0.25900418 
                   infotheo.class.entropy 
                               1.00000000 
           infotheo.equivalent.attributes 
                               1.87806411 
              infotheo.joint.entropy.mean 
                               3.01821959 
                infotheo.joint.entropy.sd 
                               0.38218827 
         infotheo.mutual.information.mean 
                               0.84393418 
           infotheo.mutual.information.sd 
                               0.42220265 
                    infotheo.noise.signal 
                               1.69830435 
         landmarking.decision.stumps.mean 
                               0.94444444 
           landmarking.decision.stumps.sd 
                               0.06085806 
  landmarking.elite.nearest.neighbor.mean 
                               0.96666667 
    landmarking.elite.nearest.neighbor.sd 
                               0.04548588 
     landmarking.linear.discriminant.mean 
                               0.88888889 
       landmarking.linear.discriminant.sd 
                               0.12904997 
             landmarking.naive.bayes.mean 
                               0.94222222 
               landmarking.naive.bayes.sd 
                               0.07161697 
        landmarking.nearest.neighbor.mean 
                               0.97333333 
          landmarking.nearest.neighbor.sd 
                               0.04143036 
              landmarking.worst.node.mean 
                               0.82222222 
                landmarking.worst.node.sd 
                               0.16913875 
model.based.average.leaf.corrobation.mean 
                               0.33333333 
  model.based.average.leaf.corrobation.sd 
                               0.02666667 
           model.based.branch.length.mean 
                               1.66666667 
             model.based.branch.length.sd 
                               0.57735027 
                   model.based.depth.mean 
                               1.20000000 
                     model.based.depth.sd 
                               0.83666003 
             model.based.homogeneity.mean 
                               6.00000000 
               model.based.homogeneity.sd 
                               0.00000000 
                    model.based.max.depth 
                               2.00000000 
                       model.based.nleave 
                               3.00000000 
                        model.based.nnode 
                               2.00000000 
          model.based.nodes.per.attribute 
                               0.50000000 
           model.based.nodes.per.instance 
                               0.01333333 
         model.based.nodes.per.level.mean 
                               1.00000000 
           model.based.nodes.per.level.sd 
                               0.00000000 
          model.based.repeated.nodes.mean 
                               0.50000000 
            model.based.repeated.nodes.sd 
                               0.57735027 
                   model.based.shape.mean 
                               0.50000000 
                     model.based.shape.sd 
                               0.00000000 
     model.based.variable.importance.mean 
                              22.24235105 
       model.based.variable.importance.sd 
                              26.07505668 
             statistical.correlation.mean 
                               0.48505297 
               statistical.correlation.sd 
                               0.20939015 
              statistical.covariance.mean 
                               0.07154263 
                statistical.covariance.sd 
                               0.07130389 
     statistical.discreteness.degree.mean 
                               1.00000000 
       statistical.discreteness.degree.sd 
                               0.00000000 
          statistical.geometric.mean.mean 
                               3.44476412 
            statistical.geometric.mean.sd 
                               2.01825110 
           statistical.harmonic.mean.mean 
                               3.46450000 
             statistical.harmonic.mean.sd 
                               1.97548999 
                     statistical.iqr.mean 
                               1.28810588 
                       statistical.iqr.sd 
                               0.25053916 
                statistical.kurtosis.mean 
                               0.55691608 
                  statistical.kurtosis.sd 
                               0.28615735 
                     statistical.mad.mean 
                               0.35211750 
                       statistical.mad.sd 
                               0.19259539 
               statistical.normality.mean 
                               0.66666667 
                 statistical.normality.sd 
                               0.57735027 
                statistical.outliers.mean 
                               0.00000000 
                  statistical.outliers.sd 
                               0.00000000 
                statistical.skewness.mean 
                               0.29715985 
                  statistical.skewness.sd 
                               0.33328611 
      statistical.standard.deviation.mean 
                               0.35776315 
        statistical.standard.deviation.sd 
                               0.16137542 
               statistical.trim.mean.mean 
                               3.45583333 
                 statistical.trim.mean.sd 
                               2.01128389 
                statistical.variance.mean 
                               0.15186633 
                  statistical.variance.sd 
                               0.12214091 
           general.defective.instances                 general.dimensionality 
                            0.00000000                             0.02666667 
                general.majority.class                 general.missing.values 
                            0.33333333                             0.00000000 
                    general.nattribute                        general.nbinary 
                            4.00000000                             0.00000000 
                       general.nclasse                      general.ninstance 
                            3.00000000                           150.00000000 
                      general.nnumeric                      general.nsymbolic 
                            4.00000000                             0.00000000 
                       general.pbinary                       general.pnumeric 
                            0.00000000                             1.00000000 
                     general.psymbolic                        general.sdclass 
                            0.00000000                             0.00000000 
          statistical.correlation.mean             statistical.correlation.sd 
                            0.48505297                             0.20939015 
           statistical.covariance.mean              statistical.covariance.sd 
                            0.07154263                             0.07130389 
  statistical.discreteness.degree.mean     statistical.discreteness.degree.sd 
                            1.00000000                             0.00000000 
       statistical.geometric.mean.mean          statistical.geometric.mean.sd 
                            3.44476412                             2.01825110 
        statistical.harmonic.mean.mean           statistical.harmonic.mean.sd 
                            3.46450000                             1.97548999 
                  statistical.iqr.mean                     statistical.iqr.sd 
                            1.28810588                             0.25053916 
             statistical.kurtosis.mean                statistical.kurtosis.sd 
                            0.55691608                             0.28615735 
                  statistical.mad.mean                     statistical.mad.sd 
                            0.35211750                             0.19259539 
            statistical.normality.mean               statistical.normality.sd 
                            0.66666667                             0.57735027 
             statistical.outliers.mean                statistical.outliers.sd 
                            0.00000000                             0.00000000 
             statistical.skewness.mean                statistical.skewness.sd 
                            0.29715985                             0.33328611 
   statistical.standard.deviation.mean      statistical.standard.deviation.sd 
                            0.35776315                             0.16137542 
            statistical.trim.mean.mean               statistical.trim.mean.sd 
                            3.45583333                             2.01128389 
             statistical.variance.mean                statistical.variance.sd 
                            0.15186633                             0.12214091 
infotheo.attributes.concentration.mean   infotheo.attributes.concentration.sd 
                            0.20980486                             0.11958799 
       infotheo.attribute.entropy.mean          infotheo.attribute.entropy.sd 
                            0.98073290                             0.02628825 
     infotheo.class.concentration.mean        infotheo.class.concentration.sd 
                            0.51481787                             0.25900418 
                infotheo.class.entropy         infotheo.equivalent.attributes 
                            1.00000000                             1.87806411 
           infotheo.joint.entropy.mean              infotheo.joint.entropy.sd 
                            3.01821959                             0.38218827 
      infotheo.mutual.information.mean         infotheo.mutual.information.sd 
                            0.84393418                             0.42220265 
                 infotheo.noise.signal 
                            1.69830435 
                        discriminant.cancor 
                               9.848209e-01 
                  discriminant.cancor.fract 
                               8.137202e-01 
             discriminant.center.of.gravity 
                               3.208281e+00 
                       discriminant.discfct 
                               6.666667e-01 
                   discriminant.eigen.fract 
                               9.246187e-01 
                discriminant.max.eigenvalue 
                               4.228242e+00 
               discriminant.min.eighenvalue 
                               2.383509e-02 
                       discriminant.sdratio 
                               1.277229e+00 
                       discriminant.wlambda 
                               3.424584e-01 
                general.defective.instances 
                               0.000000e+00 
                     general.dimensionality 
                               2.666667e-02 
                     general.majority.class 
                               3.333333e-01 
                     general.missing.values 
                               0.000000e+00 
                         general.nattribute 
                               4.000000e+00 
                            general.nbinary 
                               0.000000e+00 
                            general.nclasse 
                               3.000000e+00 
                          general.ninstance 
                               1.500000e+02 
                           general.nnumeric 
                               4.000000e+00 
                          general.nsymbolic 
                               0.000000e+00 
                            general.pbinary 
                               0.000000e+00 
                           general.pnumeric 
                               1.000000e+00 
                          general.psymbolic 
                               0.000000e+00 
                            general.sdclass 
                               0.000000e+00 
      infotheo.attributes.concentration.min 
                               8.478340e-02 
   infotheo.attributes.concentration.median 
                               1.846740e-01 
      infotheo.attributes.concentration.max 
                               4.299568e-01 
             infotheo.attribute.entropy.min 
                               9.415587e-01 
          infotheo.attribute.entropy.median 
                               9.920377e-01 
             infotheo.attribute.entropy.max 
                               9.972975e-01 
           infotheo.class.concentration.min 
                               2.227820e-01 
        infotheo.class.concentration.median 
                               5.465452e-01 
           infotheo.class.concentration.max 
                               7.433990e-01 
                     infotheo.class.entropy 
                               1.000000e+00 
             infotheo.equivalent.attributes 
                               1.878064e+00 
                 infotheo.joint.entropy.min 
                               2.682002e+00 
              infotheo.joint.entropy.median 
                               2.990150e+00 
                 infotheo.joint.entropy.max 
                               3.410577e+00 
            infotheo.mutual.information.min 
                               3.606172e-01 
         infotheo.mutual.information.median 
                               9.067693e-01 
            infotheo.mutual.information.max 
                               1.201581e+00 
                      infotheo.noise.signal 
                               1.698304e+00 
            landmarking.decision.stumps.min 
                               8.000000e-01 
         landmarking.decision.stumps.median 
                               9.333333e-01 
            landmarking.decision.stumps.max 
                               1.000000e+00 
     landmarking.elite.nearest.neighbor.min 
                               8.666667e-01 
  landmarking.elite.nearest.neighbor.median 
                               1.000000e+00 
     landmarking.elite.nearest.neighbor.max 
                               1.000000e+00 
        landmarking.linear.discriminant.min 
                               6.000000e-01 
     landmarking.linear.discriminant.median 
                               9.333333e-01 
        landmarking.linear.discriminant.max 
                               1.000000e+00 
                landmarking.naive.bayes.min 
                               7.333333e-01 
             landmarking.naive.bayes.median 
                               9.666667e-01 
                landmarking.naive.bayes.max 
                               1.000000e+00 
           landmarking.nearest.neighbor.min 
                               8.666667e-01 
        landmarking.nearest.neighbor.median 
                               1.000000e+00 
           landmarking.nearest.neighbor.max 
                               1.000000e+00 
                 landmarking.worst.node.min 
                               5.333333e-01 
              landmarking.worst.node.median 
                               8.000000e-01 
                 landmarking.worst.node.max 
                               1.000000e+00 
   model.based.average.leaf.corrobation.min 
                               3.066667e-01 
model.based.average.leaf.corrobation.median 
                               3.333333e-01 
   model.based.average.leaf.corrobation.max 
                               3.600000e-01 
              model.based.branch.length.min 
                               1.000000e+00 
           model.based.branch.length.median 
                               2.000000e+00 
              model.based.branch.length.max 
                               2.000000e+00 
                      model.based.depth.min 
                               0.000000e+00 
                   model.based.depth.median 
                               1.000000e+00 
                      model.based.depth.max 
                               2.000000e+00 
                model.based.homogeneity.min 
                               6.000000e+00 
             model.based.homogeneity.median 
                               6.000000e+00 
                model.based.homogeneity.max 
                               6.000000e+00 
                      model.based.max.depth 
                               2.000000e+00 
                         model.based.nleave 
                               3.000000e+00 
                          model.based.nnode 
                               2.000000e+00 
            model.based.nodes.per.attribute 
                               5.000000e-01 
             model.based.nodes.per.instance 
                               1.333333e-02 
            model.based.nodes.per.level.min 
                               1.000000e+00 
         model.based.nodes.per.level.median 
                               1.000000e+00 
            model.based.nodes.per.level.max 
                               1.000000e+00 
             model.based.repeated.nodes.min 
                               0.000000e+00 
          model.based.repeated.nodes.median 
                               5.000000e-01 
             model.based.repeated.nodes.max 
                               1.000000e+00 
                      model.based.shape.min 
                               5.000000e-01 
                   model.based.shape.median 
                               5.000000e-01 
                      model.based.shape.max 
                               5.000000e-01 
        model.based.variable.importance.min 
                               0.000000e+00 
     model.based.variable.importance.median 
                               1.948470e+01 
        model.based.variable.importance.max 
                               5.000000e+01 
                statistical.correlation.min 
                               1.777000e-01 
             statistical.correlation.median 
                               4.915693e-01 
                statistical.correlation.max 
                               8.642247e-01 
                 statistical.covariance.min 
                               6.069388e-03 
              statistical.covariance.median 
                               5.243673e-02 
                 statistical.covariance.max 
                               3.032898e-01 
        statistical.discreteness.degree.min 
                               1.000000e+00 
     statistical.discreteness.degree.median 
                               1.000000e+00 
        statistical.discreteness.degree.max 
                               1.000000e+00 
             statistical.geometric.mean.min 
                               2.265819e-01 
          statistical.geometric.mean.median 
                               3.182047e+00 
             statistical.geometric.mean.max 
                               6.557795e+00 
              statistical.harmonic.mean.min 
                               1.000000e-01 
           statistical.harmonic.mean.median 
                               3.200000e+00 
              statistical.harmonic.mean.max 
                               7.900000e+00 
                        statistical.iqr.min 
                               9.488963e-01 
                     statistical.iqr.median 
                               1.264961e+00 
                        statistical.iqr.max 
                               1.820498e+00 
                   statistical.kurtosis.min 
                               1.902555e-01 
                statistical.kurtosis.median 
                               5.683174e-01 
                   statistical.kurtosis.max 
                               1.258718e+00 
                        statistical.mad.min 
                               0.000000e+00 
                     statistical.mad.median 
                               2.965200e-01 
                        statistical.mad.max 
                               6.671700e-01 
                  statistical.normality.min 
                               0.000000e+00 
               statistical.normality.median 
                               1.000000e+00 
                  statistical.normality.max 
                               1.000000e+00 
                   statistical.outliers.min 
                               0.000000e+00 
                statistical.outliers.median 
                               0.000000e+00 
                   statistical.outliers.max 
                               0.000000e+00 
                   statistical.skewness.min 
                               2.933377e-02 
                statistical.skewness.median 
                               1.173949e-01 
                   statistical.skewness.max 
                               1.179633e+00 
         statistical.standard.deviation.min 
                               1.053856e-01 
      statistical.standard.deviation.median 
                               3.374932e-01 
         statistical.standard.deviation.max 
                               6.358796e-01 
                  statistical.trim.mean.min 
                               2.200000e-01 
               statistical.trim.mean.median 
                               3.186667e+00 
                  statistical.trim.mean.max 
                               6.546667e+00 
                   statistical.variance.min 
                               1.110612e-02 
                statistical.variance.median 
                               1.141265e-01 
                   statistical.variance.max 
                               4.043429e-01 

mfe documentation built on July 1, 2020, 10:46 p.m.