rm_corrupted_observations: Remove observations with too many empty values or empty...

View source: R/preprocessing_removal.R

rm_corrupted_observations R Documentation

Remove observations with too many empty values or an empty target

Description

Removes observations (rows) in which the share of missing values exceeds a given threshold, or in which the target value is missing.

Usage

rm_corrupted_observations(
  data,
  y,
  threshold = 0.3,
  na_indicators = c(""),
  verbose = FALSE
)

Arguments

data

A data source in one of the major R formats: data.table, data.frame, matrix, and so on.

y

A string giving the name of the target column.

threshold

A numeric value in the range [0,1] giving the maximum allowed fraction of missing values per observation. An observation with a larger fraction of missing fields is removed. Defaults to 0.3.
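To make the threshold concrete, here is a minimal base R sketch of a per-row missing fraction check. It is an illustration only, not forester's implementation: the toy data frame is made up, and the strict `>` comparison is an assumption read from the wording "more missing fields".

```r
# Toy data (hypothetical): 3 observations, 4 fields each.
df <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, NA, 6),
  c = c(7, 8, 9),
  d = c(10, 11, 12)
)

threshold <- 0.3

# Fraction of missing fields in each observation (row).
na_fraction <- rowMeans(is.na(df))

# Observations whose missing fraction is strictly above the threshold
# (assumed comparison; the package may treat the boundary differently).
too_sparse <- which(na_fraction > threshold)
```

In this toy case the first row has one missing field out of four (0.25) and stays under the default threshold, while the second row has two out of four (0.5) and would be flagged.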

na_indicators

A vector of values that will be treated as NA indicators. Defaults to c(""). WARNING: Do not include NA or NaN, as these are already checked by another criterion.
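The idea of NA indicators can be sketched in base R as follows. This is an assumption about the general approach, not the package's code; the data frame and the extra `"?"` indicator are hypothetical.

```r
# Toy data (hypothetical) where "" and "?" stand in for missing values.
df <- data.frame(
  x = c("a", "", "c"),
  y = c("?", "b", ""),
  stringsAsFactors = FALSE
)

na_indicators <- c("", "?")

# A cell counts as missing if it is NA/NaN or matches an indicator value.
# NA and NaN are caught by is.na(), so they need not be listed as indicators.
is_missing <- is.na(df) |
  sapply(df, function(col) col %in% na_indicators)

# Number of missing cells per observation.
missing_per_row <- rowSums(is_missing)
```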

verbose

A logical value. If TRUE, all information about the preprocessing process is printed; if FALSE, none is.

Value

A list containing two objects:

  • `data` The dataset with the corrupted observations removed.

  • `idx` The indices of the removed observations.
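The shape of the return value can be illustrated with a small stand-in written in base R. `sketch_rm_corrupted()` is a hypothetical helper that mimics the documented contract; it is not forester's code, and details such as counting the target column in the missing fraction are assumptions.

```r
# Hypothetical stand-in mimicking the documented contract (not forester's code).
sketch_rm_corrupted <- function(data, y, threshold = 0.3, na_indicators = c("")) {
  data <- as.data.frame(data)

  # A cell is missing if it is NA/NaN or equals one of the indicator values.
  missing <- is.na(data)
  for (col in names(data)) {
    missing[, col] <- missing[, col] | data[[col]] %in% na_indicators
  }

  empty_target <- missing[, y]                 # target value is missing
  too_sparse <- rowMeans(missing) > threshold  # assumed: target column included
  idx <- unname(which(empty_target | too_sparse))

  # Guard against data[-integer(0), ], which would drop every row.
  kept <- if (length(idx) > 0) data[-idx, , drop = FALSE] else data
  list(data = kept, idx = idx)
}

# Toy data (hypothetical): row 2 is mostly empty, row 3 has an empty target.
df <- data.frame(
  f1 = c(1, NA, 3, 4),
  f2 = c("u", "", "w", "x"),
  target = c("yes", "no", "", "yes"),
  stringsAsFactors = FALSE
)

res <- sketch_rm_corrupted(df, y = "target")
res$idx   # positions of the removed observations
res$data  # the remaining observations
```

Following the Usage section, the package function would be called the same way, `rm_corrupted_observations(df, y = "target")`, with the result read through `$data` and `$idx`.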


ModelOriented/forester documentation built on June 6, 2024, 7:29 a.m.