data_preproc: preprocess the data

View source: R/data_preproc.R

Description

Specify categorical and continuous variables and impute the missing values.

Usage

data_preproc(data, is.cat = NULL, levels = 5,
  detect.outliers = FALSE, alpha = 0.5, ...)

Arguments

data

A data.frame object or a matrix.

is.cat

A logical vector specifying which variables are categorical. (default = NULL)

levels

An integer giving the maximum number of levels a categorical variable may have. It is used only when is.cat is NULL. (default = 5)

detect.outliers

Logical indicating whether outliers in the data should be detected. If TRUE, outliers are treated as NA. Defaults to FALSE.

alpha

A number in the interval (0, 1). Rows in which the proportion of NA values exceeds alpha are deleted. (default = 0.5)

...

Any additional arguments.
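
As a rough illustration of the levels and alpha descriptions above, the following base R sketch flags columns with at most `levels` distinct non-missing values as categorical and drops rows whose share of NA values exceeds `alpha`. The helper names `guess_categorical` and `drop_sparse_rows` are hypothetical, and treating the `levels` bound as inclusive is an assumption. This is a sketch of the documented behaviour, not the package's implementation.

```r
# Hypothetical helper: flag columns with at most `levels` distinct non-NA values.
# Assumption: the bound is inclusive.
guess_categorical <- function(data, levels = 5) {
  vapply(as.data.frame(data),
         function(col) length(unique(col[!is.na(col)])) <= levels,
         logical(1))
}

# Hypothetical helper: drop rows whose proportion of NA values exceeds `alpha`.
drop_sparse_rows <- function(data, alpha = 0.5) {
  na_ratio <- rowMeans(is.na(data))
  data[na_ratio <= alpha, , drop = FALSE]
}

# Toy data: the last row is missing two of its three values.
toy <- data.frame(
  group = c("a", "b", "a", NA),
  score = c(1.5, 2.7, 3.1, NA),
  id    = c(10, 20, 30, 40)
)

is_cat <- guess_categorical(toy, levels = 2)  # one logical per column
kept   <- drop_sparse_rows(toy, alpha = 0.5)  # rows with NA ratio <= 0.5
```

A vector like `is_cat` has the same shape as the is.cat argument: one logical value per column of the data.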

Value

A normalized data.frame object with the specified continuous and/or categorical variables and no missing values.

Additional arguments

beta

The level of statistical significance with which to accept or reject outliers. It is passed to the AnomalyDetectionVec function. Defaults to 0.5.

Author(s)

Elyas Heidari, Vahid Balazadeh

Examples

## Using levels
data("NHANES")
df <- data_preproc(NHANES, levels = 15)

## Using is.cat
library(datasets)
data("mtcars")
## One logical flag per column; mark columns 8 (vs) and 9 (am) as categorical
l <- logical(ncol(mtcars))
l[c(8, 9)] <- TRUE
df <- data_preproc(mtcars, is.cat = l)

## Detect outliers
df <- data_preproc(NHANES, levels = 15, detect.outliers = TRUE,
  alpha = 0.4, beta = 1)
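
The is.cat vector in the mtcars example above is built from column positions. A minimal alternative sketch, assuming it is more convenient to name the categorical columns, builds the same logical vector by name; the variable names `cat_vars` and `is_cat` are illustrative only.

```r
library(datasets)

# Columns to treat as categorical (illustrative choice, matching positions 8 and 9)
cat_vars <- c("vs", "am")

# One logical value per column of mtcars, TRUE where the column name is listed
is_cat <- names(mtcars) %in% cat_vars

# The vector can then be supplied as the is.cat argument, for example:
# df <- data_preproc(mtcars, is.cat = is_cat)
```

Selecting by name keeps the call correct if the column order of the data changes.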

bAIo-lab/Questools documentation built on Nov. 9, 2019, 3:59 a.m.