impute: Impute Missing Values

Description

impute_mode: Impute NAs by the modes of their corresponding columns.

impute_median: Impute NAs by the medians of their corresponding columns.

impute_mean: Impute NAs by the means of their corresponding columns.
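
As a rough illustration of the three statistics involved, the base-R sketch below computes a column's mode, median, and mean while ignoring NAs, then fills the gaps with each. It does not call cleandata: the helpers `col_mode` and `fill_na` are hypothetical names introduced here, and how the package itself breaks ties between equally frequent values is not stated in this page.

```r
# Hypothetical helpers for illustration only; not part of cleandata.

# Most frequent non-NA value of a vector (ties: first one in table order).
col_mode <- function(v) {
  counts <- table(v, useNA = "no")
  names(counts)[which.max(counts)]
}

# Replace the NAs in v with a single fill value.
fill_na <- function(v, value) {
  v[is.na(v)] <- value
  v
}

A <- factor(c("y", "x", NA, "y", "z"))
B <- c(6, NA, 4, 5, 6)

A_filled <- fill_na(A, col_mode(A))              # mode of A is "y"
B_median <- fill_na(B, median(B, na.rm = TRUE))  # median of 6, 4, 5, 6 is 5.5
B_mean   <- fill_na(B, mean(B, na.rm = TRUE))    # mean of 6, 4, 5, 6 is 5.25
```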

Usage

impute_mode(x, cols = colnames(x), idx = row.names(x), log = eval.parent(in_log_default))

impute_median(x, cols = colnames(x), idx = row.names(x), log = eval.parent(in_log_default))

impute_mean(x, cols = colnames(x), idx = row.names(x), log = eval.parent(in_log_default))

Arguments

x

The data frame to be imputed.

cols

The names or indices of the columns of x to be imputed. Defaults to all columns.

idx

The names or indices of the rows of x used to calculate the values that replace NAs. Defaults to all rows. Restrict it (for example, to the training rows) to prevent data leakage.

log

Controls log files. To produce a log file, supply a list of arguments for sink(), such as file, append, and split, either directly as log or by assigning the list to the log_arg variable in the parent environment (dynamic scope).
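
To make the shape of such a list concrete, the sketch below builds a list of sink() arguments and applies it with do.call(). Calling sink() directly like this is an assumption made for illustration: this page does not say how the functions consume the list internally, and the file path is a temporary placeholder.

```r
# A list of arguments for sink(); the temp file path is a placeholder.
log_arg <- list(file = tempfile(fileext = ".log"), append = FALSE, split = FALSE)

# Illustration only: divert console output to the file, then restore it.
do.call(sink, log_arg)
print("this line goes to the log file")
sink()

# Read the log back to confirm what was captured.
logged <- readLines(log_arg$file)
```

With the package, a list of this shape would be passed as log or left in a log_arg variable in the calling environment; the vignettes cover the details.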

Value

An imputed data frame.

See Also

inspect_map, sink

Examples

# refer to vignettes if you want to use log files
message('refer to vignettes if you want to use log files')

# building a data frame
A <- as.factor(c('y', 'x', 'x', 'y', 'z'))
B <- c(6, 3:6)
C <- 1:5
df <- data.frame(A, B, C)
df[3, 1] <- NA; df[2, 2] <- NA; df[5, 3] <- NA
print(df)

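# imputation
# impute all three columns with their modes
df0 <- impute_mode(df, cols = 1:3)
print(df0)
# same columns, but calculate the modes from rows 1 to 3 only
df0 <- impute_mode(df, cols = 1:3, idx = 1:3)
print(df0)
# impute the numeric columns B and C with their medians
df0 <- impute_median(df, cols = 2:3)
print(df0)
# impute the numeric columns B and C with their means
df0 <- impute_mean(df, cols = 2:3)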
print(df0)
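
The idx = 1:3 call above restricts which rows supply the imputation value. The base-R sketch below shows the same idea for a median without using cleandata: the statistic comes from a hypothetical set of training rows only and then fills NAs in every row, so the held-out rows never influence it. The variable names and the train/held-out split are assumptions for illustration.

```r
# Illustration only; plain base R, not the cleandata implementation.
B <- c(6, NA, 4, 5, 6)
train_rows <- 1:3   # hypothetical training rows; rows 4 and 5 are held out

# Compute the fill value from the training rows only (median of 6 and 4).
fill_value <- median(B[train_rows], na.rm = TRUE)

# Apply it to the NAs in all rows.
B_imputed <- B
B_imputed[is.na(B_imputed)] <- fill_value
```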

cleandata documentation built on May 1, 2019, 10:25 p.m.
