Removes from a list of variable names those whose columns contain only, or a large proportion of, NA values or (if they are numeric) have zero bandwidth, and returns the remaining variable names.
remove_empty_features(
  all.features,
  dataset,
  percentage_NA_allowed = NA,
  bandwidth = (.Machine$double.eps^0.5),
  verbose = FALSE
)
all.features | a character vector of the column names to check
dataset | the dataset as a data.frame
percentage_NA_allowed | the percentage of missing values per column that is tolerated without removing the feature. Features whose share of NA values exceeds this level are excluded.
bandwidth | the interval length that the values of a numeric variable must exceed for the variable to be kept. The default is .Machine$double.eps^0.5, the square root of the machine epsilon.
verbose | logical; whether debug messages are printed when a variable is removed from the list (uses the futile.logger package)
The function checks each given column name for the proportion of NA values.
If the share of NA or Inf values exceeds percentage_NA_allowed, the column name is removed from the variable set.
In addition, all numeric variables are checked for their bandwidth; those with almost zero bandwidth are removed.
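The two checks described above can be sketched in base R as follows. This is a hypothetical re-implementation for illustration, not the package's code: the name sketch_remove_empty, the interpretation of percentage_NA_allowed as a fraction between 0 and 1, the illustrative default of 0.5 (the real default is NA, whose effect is not documented here), and the max-minus-min definition of bandwidth are all assumptions.

```r
# Hypothetical sketch of the documented behaviour; not the package's implementation.
sketch_remove_empty <- function(all.features, dataset,
                                percentage_NA_allowed = 0.5,
                                bandwidth = .Machine$double.eps^0.5) {
  keep <- vapply(all.features, function(feature) {
    column <- dataset[[feature]]
    # NA/NaN count as missing; for numeric columns Inf counts as missing too
    missing <- is.na(column)
    if (is.numeric(column)) missing <- missing | is.infinite(column)
    # Assumption: the threshold is a fraction between 0 and 1
    if (mean(missing) > percentage_NA_allowed) return(FALSE)
    if (is.numeric(column)) {
      finite_values <- column[!missing]
      # Assumption: bandwidth is the distance between minimum and maximum
      if (length(finite_values) == 0 ||
          diff(range(finite_values)) <= bandwidth) return(FALSE)
    }
    TRUE
  }, logical(1))
  all.features[keep]
}

example_data <- data.frame(
  mostly_na = c(NA, NA, NA, 4),   # 75% missing, removed
  constant  = c(1, 1, 1, 1),      # zero bandwidth, removed
  varying   = c(1, 2, 3, 4),      # kept
  label     = c("a", "b", NA, "d") # non-numeric, 25% missing, kept
)

kept <- sketch_remove_empty(names(example_data), example_data)
print(kept)  # "varying" "label"
```

Non-numeric columns pass through the bandwidth check untouched in this sketch, which matches the statement that only numeric variables are tested for bandwidth.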
a vector of the variable names that are not considered empty
Konstantin Hopf konstantin.hopf@uni-bamberg.de
naInf_omit, replaceNAsFeatures