rm_sparse_columns: Remove columns with too many empty values

View source: R/preprocessing_removal.R

rm_sparse_columns    R Documentation

Remove columns with too many empty values

Description

Removes columns whose share of empty (missing) values exceeds a given threshold.

Usage

rm_sparse_columns(
  data,
  y,
  threshold = 0.3,
  na_indicators = c(""),
  verbose = FALSE
)

Arguments

data

A data source in one of the major R formats: data.table, data.frame, matrix, and so on.

y

A string giving the name of the target column.

threshold

A numeric value from the [0, 1] range that sets the maximum allowed fraction of missing values per column. A column with a larger fraction of missing values is removed. By default set to 0.3.

na_indicators

A list containing the values that will be treated as NA indicators. By default the list is c(""). WARNING: Do not include NA or NaN, as these are already checked by another criterion.

verbose

A logical value. If TRUE, all information about the preprocessing is printed; if FALSE, nothing is printed.
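To make the interplay of threshold and na_indicators concrete, the base R sketch below computes the share of missing entries per column on a toy data frame and flags the columns above the threshold. It is an illustration under assumptions, not forester's implementation: the toy data, the helper name `missing_share`, and the strict `>` comparison are all hypothetical.

```r
# Toy data (hypothetical): column b is 60% empty, column c is 20% empty.
toy <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c("", NA, "x", "", "y"),
  c = c("u", "", "v", "w", "z"),
  y = c(0, 1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fraction of entries that are NA or match an indicator.
missing_share <- function(column, na_indicators = c("")) {
  mean(is.na(column) | column %in% na_indicators)
}

# One share per column, then the names of columns above the default threshold.
shares <- vapply(toy, missing_share, numeric(1))
too_sparse <- names(shares)[shares > 0.3]
print(too_sparse)
```

With the default threshold of 0.3, only column b would be flagged here; raising the threshold to 0.7 would keep every column.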

Value

A list containing two objects:

  • `data` The dataset with the sparse columns removed.

  • `idx` The indices of the removed columns.
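The shape of the returned list can be sketched with a small stand-in function. Everything here is hypothetical and only mimics the documented `data` and `idx` elements: the name `sketch_rm_sparse`, the toy data, and the choice to never drop the target column are assumptions, not forester's actual behaviour.

```r
# Hypothetical stand-in mimicking the documented return shape; not forester's code.
sketch_rm_sparse <- function(data, y, threshold = 0.3, na_indicators = c("")) {
  # Fraction of NA or indicator values in each column.
  shares <- vapply(
    data,
    function(col) mean(is.na(col) | col %in% na_indicators),
    numeric(1)
  )
  # Assumption: the target column y is never removed.
  idx <- unname(which(shares > threshold & names(data) != y))
  # Guard against an empty index, since data[, -integer(0)] would drop everything.
  if (length(idx) > 0) {
    data <- data[, -idx, drop = FALSE]
  }
  list(data = data, idx = idx)
}

toy <- data.frame(
  id     = 1:4,
  sparse = c(NA, NA, NA, 1),
  target = c(1, 0, 1, 0)
)

res <- sketch_rm_sparse(toy, y = "target")
print(res$idx)          # positions of the removed columns in the input
print(names(res$data))  # columns that survived
```

Keeping `idx` alongside `data` makes it possible to drop the same column positions from a second dataset that has the same layout, for example a held-out test set.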


ModelOriented/forester documentation built on June 6, 2024, 7:29 a.m.