impute_missing: Imputes missing values


View source: R/impute_missing.R

Description

Imputes missing values in a single data frame or in a list of data frames.

Usage

impute_missing(data_object, method = "randomforest", threshold = 0.1)

Arguments

data_object

The output produced by as.MLinput, which contains a single x data frame or a list of x data frames, a y data frame, and attributes.

method

Specifies which imputation package to use: missForest, mice, or amelia.

threshold

A percentage. If a column in the x data frame has less than threshold percent missing values, its missing values are imputed. If a column has more missing values than the threshold, it is dropped from the x data frame.
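
The threshold rule can be illustrated with a minimal base R sketch. The helper split_by_missingness is hypothetical and is not part of peppuR. It assumes that threshold is a proportion between 0 and 1 (suggested by the default of 0.1, although the text above calls it a percentage) and that a column exactly at the threshold is kept. The actual implementation may differ on both points.

```r
# Hypothetical helper, not part of peppuR: decide which columns would be
# kept for imputation and which would be dropped, based on the fraction
# of missing values per column.
split_by_missingness <- function(x, threshold = 0.1) {
  # Fraction of NA values in each column
  frac_missing <- colMeans(is.na(x))
  list(
    keep = names(frac_missing)[frac_missing <= threshold],  # assumed: ties are kept
    drop = names(frac_missing)[frac_missing > threshold]
  )
}

# Toy data, made up for illustration only
toy <- data.frame(
  a = c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA),     # 1 of 10 values missing
  b = c(NA, NA, NA, 4, 5, 6, 7, 8, 9, 10),  # 3 of 10 values missing
  c = 1:10                                  # no missing values
)

cols <- split_by_missingness(toy, threshold = 0.2)

# Columns that would be passed on to the imputation step
toy_kept <- toy[, cols$keep, drop = FALSE]
```

With a threshold of 0.2, columns a and c fall below the cutoff and would be imputed, while column b exceeds it and would be dropped.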

Examples

## Not run:
library(peppuR)
library(missForest)
library(mice)

# Load the single-source and multi-source example data sets
data('single_source')
data('multi_source')

x_multi = multi_source$X
y_multi = multi_source$Y

x_single = single_source$X
y_single = single_source$Y

# Column names identifying samples, outcomes, and pairs
sample_cname = 'ID'
outcome_cname = 'Group'
pair_cname = 'paircol'

# Build the input objects expected by impute_missing
result = as.MLinput(x = x_single, y = y_single, categorical_features = TRUE,
                    sample_cname = sample_cname, outcome_cname = outcome_cname,
                    pair_cname = pair_cname)
result2 = as.MLinput(x = x_multi, y = y_multi, categorical_features = TRUE,
                     sample_cname = sample_cname, outcome_cname = outcome_cname,
                     pair_cname = pair_cname)

# Impute missing values for the single-source and multi-source inputs
imputed_res = impute_missing(result, method = 'randomforest')
imputed_res2 = impute_missing(result2, method = 'randomforest')

## End(Not run)

pmartR/peppuR documentation built on Jan. 17, 2020, 12:54 p.m.