drop_invar_cols: Drop invariant columns from a dataframe

View source: R/dataframe_tools.R

drop_invar_cols    R Documentation

Description

Deletes columns from a dataframe if they do not vary. A character or factor column is invariant when every row contains exactly the same string. A numeric column is invariant when every value is the same after rounding to a common value (controlled by nearest and dir).
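The check described above can be sketched in base R. This is a minimal illustration, not the package's source: is_invariant() is a hypothetical helper, and it assumes that "rounded to a common value" means rounding to the nearest multiple of nearest.

```r
# Hypothetical helper (not part of the package): TRUE if a column does not vary.
is_invariant <- function(x, nearest = 1) {
    if (is.numeric(x)) {
        # Assumption: numeric values are rounded to the nearest multiple
        # of `nearest` before being compared.
        x <- round(x / nearest) * nearest
    }
    # Character and factor columns are compared as-is.
    length(unique(x)) == 1
}

df <- data.frame(
    id   = c("A", "A", "A"),
    grp  = c("x", "y", "x"),
    temp = c(20.1, 19.9, 20.2),
    stringsAsFactors = FALSE
)

# Flag each column, then keep only the ones that vary.
invariant <- vapply(df, is_invariant, logical(1), nearest = 1)
kept <- df[!invariant]
names(kept)
```

With nearest = 1, every temp value rounds to 20, so temp is treated as invariant and only grp survives. A finer nearest such as 0.1 would keep temp.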

Usage

drop_invar_cols(
  df,
  from = 1,
  to = NULL,
  cols = NULL,
  nearest = NULL,
  dir = NULL
)

Arguments

df

(Dataframe) A dataframe.

from, to

(Numeric or NULL) The start and end of a contiguous range of columns to check. If to is NULL, it defaults to the last column in df, so that from = 2, to = NULL is the same as 2:length(df).

cols

(Numeric or NULL) A numeric vector of the column positions to check. This allows you to select non-contiguous columns. If cols is not NULL, from and to are ignored.

nearest

(Numeric or NULL) For numeric columns, this is the common value that all numbers will be rounded to. The default NULL uses the mean() of each column as the rounding target.

dir

(Character or NULL) Controls the rounding function used. Leave as NULL to round to the nearest value in either direction. Use "up" to round up only. Use "down" to round down only.
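The argument descriptions above can be sketched in base R. Both helpers below, resolve_cols() and round_to_common(), are hypothetical names used only for illustration. They assume that cols takes priority over from and to, and that rounding means snapping to a multiple of nearest with round(), ceiling(), or floor(). The package's internals may differ.

```r
# Hypothetical helper: turn from/to/cols into a vector of column positions.
resolve_cols <- function(df, from = 1, to = NULL, cols = NULL) {
    if (!is.null(cols)) return(cols)   # cols overrides from and to
    if (is.null(to)) to <- length(df)  # to = NULL means the last column
    from:to
}

# Hypothetical helper: round x to a multiple of `nearest` in direction `dir`.
round_to_common <- function(x, nearest = NULL, dir = NULL) {
    # Assumption: the column mean is the default rounding target.
    # A mean of exactly 0 divides by zero in this sketch.
    if (is.null(nearest)) nearest <- mean(x, na.rm = TRUE)
    fn <- if (is.null(dir)) {
        round
    } else {
        switch(dir,
               up   = ceiling,
               down = floor,
               stop("dir must be NULL, \"up\", or \"down\""))
    }
    fn(x / nearest) * nearest
}

df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15)

resolve_cols(df, from = 2)           # same positions as 2:length(df)
resolve_cols(df, cols = c(1, 4))     # non-contiguous; from and to are ignored

x <- c(4.2, 5.6, 6.1)
round_to_common(x, nearest = 5)                # 5  5  5
round_to_common(x, nearest = 5, dir = "up")    # 5 10 10
round_to_common(x, nearest = 5, dir = "down")  # 0  5  5
```

In this sketch the same column can count as invariant or not depending on dir: x collapses to a single value with the default rounding, but not when rounding only up or only down.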

Value

A copy of df with all invariant columns removed.

Examples

df <- data.frame(stringsAsFactors=FALSE,
         char_invar = c("A", "A", "A", "A", "A"),
           char_var = c("A", "A", "A", "B", "A"),
          num_invar = c(1L, 1L, 1L, 1L, 1L),
         num_mean_0 = c(0, -0.1, 0.1, 0.01, -0.01),
            num_var = c(0, 0.2, 0.8, 0.03, 0.4)
      )

df

#>   char_invar char_var num_invar num_mean_0 num_var
#> 1          A        A         1       0.00    0.00
#> 2          A        A         1      -0.10    0.20
#> 3          A        A         1       0.10    0.80
#> 4          A        B         1       0.01    0.03
#> 5          A        A         1      -0.01    0.40


drop_invar_cols(df)

#>   char_var num_var
#> 1        A    0.00
#> 2        A    0.20
#> 3        A    0.80
#> 4        B    0.03
#> 5        A    0.40


DesiQuintans/desiderata documentation built on April 9, 2023, 5:43 a.m.