View source: R/remove_almost_constant.R
| remove_almost_constant | R Documentation |
Test all columns specified by .what and remove those that are almost
constant. A column is considered almost constant if the proportion of its
most frequent value is greater than or equal to the threshold specified by
.threshold. See is_almost_constant() for further details.
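The criterion described above can be sketched in base R. This is only an illustration of the stated rule, not the package's implementation: the helper name `most_frequent_share()` is hypothetical, and counting `NA` as an ordinary value when it is not dropped is an assumption.

```r
# Hypothetical helper: share of the most frequent value in a vector.
# Assumption: when NAs are kept, NA counts as a value of its own.
most_frequent_share <- function(x, na_rm = FALSE) {
  if (na_rm) x <- x[!is.na(x)]
  if (length(x) == 0L) return(NA_real_)
  counts <- table(x, useNA = "ifany")
  max(counts) / length(x)
}

x <- c("a", "a", "a", "b")
share <- most_frequent_share(x)        # 3 of the 4 values are "a"
meets_075 <- share >= 0.75             # compared against a threshold of 0.75
meets_1 <- share >= 1                  # compared against the default threshold of 1
```

Under this reading, a threshold of 1 flags only strictly constant vectors, while lower thresholds also flag vectors dominated by a single value.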
remove_almost_constant(
.data,
.what = everything(),
...,
.threshold = 1,
.na_rm = FALSE,
.verbose = FALSE
)
| .data | A data frame. |
| .what | A tidyselect expression (see tidyselect syntax) specifying the columns to process. |
| ... | Additional tidyselect expressions selecting more columns. |
| .threshold | Numeric scalar in the interval |
| .na_rm | Logical; if |
| .verbose | Logical; if |
A data frame from which every selected column that meets the definition of almost constant has been removed. Columns that were not selected are kept unchanged.
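The shape of the return value can be illustrated with a base R sketch. The function `drop_almost_constant()` and its arguments `cols` and `threshold` are hypothetical stand-ins, not the package's code, and `NA` is assumed to count as an ordinary value.

```r
# Hypothetical base R sketch of the column filter; not the package's code.
drop_almost_constant <- function(data, cols = names(data), threshold = 1) {
  is_ac <- vapply(data[cols], function(x) {
    # Assumption: NA is counted as an ordinary value.
    max(table(x, useNA = "ifany")) / length(x) >= threshold
  }, logical(1))
  # Keep every column except the selected ones flagged as almost constant.
  data[setdiff(names(data), cols[is_ac])]
}

df <- data.frame(id = 1:4, flag = c(TRUE, TRUE, TRUE, FALSE), unit = "kg")

strict <- drop_almost_constant(df)                                   # threshold = 1
loose <- drop_almost_constant(df, threshold = 0.75)                  # all columns checked
subset_only <- drop_almost_constant(df, cols = "unit", threshold = 0.75)
```

In this sketch only the columns named in `cols` are ever candidates for removal, which mirrors the role of `.what` described above.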
Michal Burda
is_almost_constant(), remove_ill_conditions()
d <- data.frame(a1 = 1:10,
                a2 = c(1:9, NA),
                b1 = "b",
                b2 = NA,
                c1 = rep(c(TRUE, FALSE), 5),
                c2 = rep(c(TRUE, NA), 5),
                d = c(rep(TRUE, 4), rep(FALSE, 4), NA, NA))
# Remove columns that are constant (threshold = 1)
remove_almost_constant(d, .threshold = 1.0, .na_rm = FALSE)
remove_almost_constant(d, .threshold = 1.0, .na_rm = TRUE)
# Remove columns where the most frequent value occurs in >= 50% of rows
remove_almost_constant(d, .threshold = 0.5, .na_rm = FALSE)
remove_almost_constant(d, .threshold = 0.5, .na_rm = TRUE)
# Restrict check to a subset of columns
remove_almost_constant(d, a1:b2, .threshold = 0.5, .na_rm = TRUE)