missing_val: Missing value imputation



Description

The function imputes missing values in the input dataset. For numerical variables, missing values can be replaced using one of four methods: 1. "mean" - the mean (simple average) of the non-missing values; 2. "median" - the median (50th percentile) of the non-missing values; 3. "mode" - the mode, i.e. the value with the highest frequency among the non-missing values; 4. a special extreme value of the user's choice, passed as an argument (-99999 is the default). For categorical variables, missing values can be replaced using one of two methods: 1. "mode" - the class with the highest frequency among the non-missing values; 2. a special class of the user's choice, passed as an argument ("missing_value" is the default). The target column remains unchanged.
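The replacement rules described above can be sketched for a single column in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper names `stat_mode` and `impute_column` are hypothetical, and ties for the mode are broken by taking the first value in sorted order, which may differ from what `missing_val` does.

```r
# Hypothetical helper: most frequent non-missing value of a vector.
# Ties are resolved by taking the first value in table() (sorted) order.
stat_mode <- function(x) {
  x <- x[!is.na(x)]
  counts <- table(x)
  winner <- names(counts)[which.max(counts)]
  # table() names are character; convert back for numeric input
  if (is.numeric(x)) as.numeric(winner) else winner
}

# Hypothetical imputer for one column.
# `method` is "mean", "median", "mode", or a replacement value.
impute_column <- function(x, method) {
  if (is.numeric(x)) {
    fill <- if (identical(method, "mean")) {
      mean(x, na.rm = TRUE)
    } else if (identical(method, "median")) {
      median(x, na.rm = TRUE)
    } else if (identical(method, "mode")) {
      stat_mode(x)
    } else {
      method  # user-supplied replacement value
    }
  } else {
    # categorical (character) column: only "mode" or a replacement class
    fill <- if (identical(method, "mode")) stat_mode(x) else method
  }
  x[is.na(x)] <- fill
  x
}

num <- c(1, 2, 2, NA, 10)
impute_column(num, "mean")    # NA replaced by 3.75
impute_column(num, "median")  # NA replaced by 2
impute_column(num, "mode")    # NA replaced by 2
impute_column(num, -99999)    # NA replaced by -99999

cat_var <- c("a", "b", "b", NA)
impute_column(cat_var, "mode")           # NA replaced by "b"
impute_column(cat_var, "missing_value")  # NA replaced by "missing_value"
```

The sketch assumes categorical columns are stored as character vectors, as in the package example further down; factor columns would need the new class added to their levels first.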

Usage

missing_val(base, target, num_missing = -99999,
  cat_missing = "missing_value")

Arguments

base

input dataframe

target

column/field name of the target variable, to be passed as a string

num_missing

(optional) method for replacing missing values in numerical fields: one of "mean", "median", "mode", or a value of the user's choice (default is -99999)

cat_missing

(optional) method for replacing missing values in categorical fields: either "mode" or a class of the user's choice (default is "missing_value")

Value

The function returns an object of class "missing_val" which is a list containing the following components:

base

a dataframe after imputing missing values

mapping_table

a dataframe with mapping between original variable and imputed missing value (if any)
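To make the shape of the returned object concrete, the following base R sketch builds a list with the same two components for the simplest case, where fixed replacement values are used. The function name `impute_with_constants` is hypothetical, and the column names of the mapping table are taken from the example output on this page; the package's actual internals may differ.

```r
# Hypothetical sketch: impute every non-target column with fixed
# replacement values and record the value used for each column.
# Assumes categorical columns are character vectors, not factors.
impute_with_constants <- function(base, target,
                                  num_missing = -99999,
                                  cat_missing = "missing_value") {
  vars <- setdiff(names(base), target)  # the target column is left untouched
  fills <- character(0)
  for (v in vars) {
    fill <- if (is.numeric(base[[v]])) num_missing else cat_missing
    base[[v]][is.na(base[[v]])] <- fill
    fills[v] <- as.character(fill)
  }
  mapping_table <- data.frame(Variable_name = vars,
                              imputed_missing = unname(fills[vars]),
                              stringsAsFactors = FALSE)
  structure(list(base = base, mapping_table = mapping_table),
            class = "missing_val")
}

toy <- data.frame(x = c(1, NA, 3), g = c("a", NA, "b"), Y = c(0, 1, NA),
                  stringsAsFactors = FALSE)
res <- impute_with_constants(toy, target = "Y")
res$base           # x and g are filled; Y keeps its NA
res$mapping_table  # one row per non-target column
```

Components are accessed with `$`, exactly as with the object returned by `missing_val`.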

Author(s)

Arya Poddar <aryapoddar290990@gmail.com>

Examples

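library(scorecardModelUtils)

# Build a sample dataset from iris with a random binary target Y
data <- iris
data$Species <- as.character(data$Species)
data$Y <- sample(0:1, size = nrow(data), replace = TRUE)
# Introduce missing values in one numerical and one categorical column
data[sample(1:nrow(data), size = 25), "Sepal.Length"] <- NA
data[sample(1:nrow(data), size = 10), "Species"] <- NA

# Impute with the defaults (-99999 and "missing_value")
missing_list <- missing_val(base = data, target = "Y")
missing_list$base           # data after imputation
missing_list$mapping_table  # variable-to-imputed-value mapping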

Example output

    Sepal.Length Sepal.Width Petal.Length Petal.Width       Species Y
1       -99999.0         3.5          1.4         0.2        setosa 1
2            4.9         3.0          1.4         0.2        setosa 1
3       -99999.0         3.2          1.3         0.2        setosa 0
4            4.6         3.1          1.5         0.2        setosa 0
5            5.0         3.6          1.4         0.2 missing_value 1
6       -99999.0         3.9          1.7         0.4        setosa 0
7            4.6         3.4          1.4         0.3 missing_value 0
8       -99999.0         3.4          1.5         0.2        setosa 1
9            4.4         2.9          1.4         0.2        setosa 0
10           4.9         3.1          1.5         0.1        setosa 1
11           5.4         3.7          1.5         0.2        setosa 0
12           4.8         3.4          1.6         0.2        setosa 0
13      -99999.0         3.0          1.4         0.1        setosa 0
14           4.3         3.0          1.1         0.1        setosa 0
15           5.8         4.0          1.2         0.2        setosa 1
16           5.7         4.4          1.5         0.4        setosa 1
17           5.4         3.9          1.3         0.4        setosa 0
18           5.1         3.5          1.4         0.3        setosa 0
19           5.7         3.8          1.7         0.3        setosa 0
20           5.1         3.8          1.5         0.3        setosa 0
21           5.4         3.4          1.7         0.2        setosa 1
22      -99999.0         3.7          1.5         0.4        setosa 0
23      -99999.0         3.6          1.0         0.2        setosa 0
24           5.1         3.3          1.7         0.5        setosa 1
25      -99999.0         3.4          1.9         0.2        setosa 1
26      -99999.0         3.0          1.6         0.2        setosa 0
27           5.0         3.4          1.6         0.4        setosa 0
28           5.2         3.5          1.5         0.2 missing_value 1
29           5.2         3.4          1.4         0.2        setosa 0
30           4.7         3.2          1.6         0.2        setosa 1
31           4.8         3.1          1.6         0.2        setosa 1
32           5.4         3.4          1.5         0.4        setosa 0
33           5.2         4.1          1.5         0.1        setosa 0
34           5.5         4.2          1.4         0.2        setosa 0
35           4.9         3.1          1.5         0.2        setosa 0
36           5.0         3.2          1.2         0.2        setosa 0
37           5.5         3.5          1.3         0.2 missing_value 0
38           4.9         3.6          1.4         0.1        setosa 0
39           4.4         3.0          1.3         0.2        setosa 1
40           5.1         3.4          1.5         0.2        setosa 0
41      -99999.0         3.5          1.3         0.3        setosa 0
42           4.5         2.3          1.3         0.3        setosa 0
43           4.4         3.2          1.3         0.2        setosa 0
44           5.0         3.5          1.6         0.6        setosa 1
45      -99999.0         3.8          1.9         0.4        setosa 1
46           4.8         3.0          1.4         0.3        setosa 0
47      -99999.0         3.8          1.6         0.2 missing_value 1
48           4.6         3.2          1.4         0.2        setosa 0
49           5.3         3.7          1.5         0.2        setosa 1
50           5.0         3.3          1.4         0.2        setosa 0
51           7.0         3.2          4.7         1.4    versicolor 1
52           6.4         3.2          4.5         1.5    versicolor 1
53           6.9         3.1          4.9         1.5    versicolor 1
54      -99999.0         2.3          4.0         1.3    versicolor 0
55           6.5         2.8          4.6         1.5    versicolor 1
56           5.7         2.8          4.5         1.3    versicolor 1
57           6.3         3.3          4.7         1.6 missing_value 0
58           4.9         2.4          3.3         1.0    versicolor 1
59           6.6         2.9          4.6         1.3    versicolor 0
60           5.2         2.7          3.9         1.4    versicolor 0
61           5.0         2.0          3.5         1.0    versicolor 1
62           5.9         3.0          4.2         1.5    versicolor 0
63      -99999.0         2.2          4.0         1.0    versicolor 1
64           6.1         2.9          4.7         1.4    versicolor 0
65           5.6         2.9          3.6         1.3    versicolor 0
66           6.7         3.1          4.4         1.4 missing_value 1
67           5.6         3.0          4.5         1.5 missing_value 1
68      -99999.0         2.7          4.1         1.0    versicolor 1
69           6.2         2.2          4.5         1.5    versicolor 1
70           5.6         2.5          3.9         1.1    versicolor 1
71           5.9         3.2          4.8         1.8    versicolor 0
72           6.1         2.8          4.0         1.3    versicolor 1
73           6.3         2.5          4.9         1.5    versicolor 0
74           6.1         2.8          4.7         1.2    versicolor 0
75      -99999.0         2.9          4.3         1.3    versicolor 0
76           6.6         3.0          4.4         1.4    versicolor 1
77           6.8         2.8          4.8         1.4    versicolor 1
78           6.7         3.0          5.0         1.7    versicolor 0
79           6.0         2.9          4.5         1.5    versicolor 1
80           5.7         2.6          3.5         1.0    versicolor 1
81           5.5         2.4          3.8         1.1    versicolor 1
82           5.5         2.4          3.7         1.0    versicolor 1
83           5.8         2.7          3.9         1.2    versicolor 1
84           6.0         2.7          5.1         1.6    versicolor 1
85           5.4         3.0          4.5         1.5    versicolor 0
86           6.0         3.4          4.5         1.6    versicolor 0
87           6.7         3.1          4.7         1.5    versicolor 0
88           6.3         2.3          4.4         1.3    versicolor 0
89           5.6         3.0          4.1         1.3    versicolor 1
90           5.5         2.5          4.0         1.3    versicolor 0
91           5.5         2.6          4.4         1.2    versicolor 0
92           6.1         3.0          4.6         1.4    versicolor 1
93      -99999.0         2.6          4.0         1.2    versicolor 0
94           5.0         2.3          3.3         1.0    versicolor 0
95           5.6         2.7          4.2         1.3    versicolor 1
96           5.7         3.0          4.2         1.2    versicolor 0
97           5.7         2.9          4.2         1.3    versicolor 1
98           6.2         2.9          4.3         1.3    versicolor 0
99      -99999.0         2.5          3.0         1.1    versicolor 1
100          5.7         2.8          4.1         1.3    versicolor 1
101     -99999.0         3.3          6.0         2.5     virginica 1
102          5.8         2.7          5.1         1.9     virginica 0
103          7.1         3.0          5.9         2.1     virginica 0
104          6.3         2.9          5.6         1.8     virginica 0
105          6.5         3.0          5.8         2.2     virginica 1
106          7.6         3.0          6.6         2.1     virginica 0
107          4.9         2.5          4.5         1.7     virginica 1
108     -99999.0         2.9          6.3         1.8     virginica 1
109          6.7         2.5          5.8         1.8     virginica 1
110     -99999.0         3.6          6.1         2.5     virginica 1
111          6.5         3.2          5.1         2.0     virginica 1
112          6.4         2.7          5.3         1.9     virginica 1
113          6.8         3.0          5.5         2.1     virginica 0
114          5.7         2.5          5.0         2.0     virginica 0
115          5.8         2.8          5.1         2.4     virginica 0
116          6.4         3.2          5.3         2.3     virginica 0
117          6.5         3.0          5.5         1.8     virginica 0
118          7.7         3.8          6.7         2.2     virginica 0
119          7.7         2.6          6.9         2.3     virginica 1
120          6.0         2.2          5.0         1.5     virginica 0
121          6.9         3.2          5.7         2.3     virginica 0
122     -99999.0         2.8          4.9         2.0     virginica 1
123          7.7         2.8          6.7         2.0     virginica 0
124          6.3         2.7          4.9         1.8     virginica 0
125          6.7         3.3          5.7         2.1     virginica 0
126          7.2         3.2          6.0         1.8     virginica 0
127          6.2         2.8          4.8         1.8     virginica 1
128          6.1         3.0          4.9         1.8     virginica 1
129          6.4         2.8          5.6         2.1     virginica 0
130          7.2         3.0          5.8         1.6     virginica 0
131     -99999.0         2.8          6.1         1.9     virginica 1
132          7.9         3.8          6.4         2.0     virginica 1
133          6.4         2.8          5.6         2.2     virginica 0
134     -99999.0         2.8          5.1         1.5     virginica 0
135          6.1         2.6          5.6         1.4     virginica 0
136     -99999.0         3.0          6.1         2.3     virginica 1
137          6.3         3.4          5.6         2.4     virginica 0
138          6.4         3.1          5.5         1.8     virginica 0
139          6.0         3.0          4.8         1.8     virginica 0
140          6.9         3.1          5.4         2.1     virginica 0
141          6.7         3.1          5.6         2.4     virginica 0
142          6.9         3.1          5.1         2.3     virginica 0
143          5.8         2.7          5.1         1.9     virginica 1
144          6.8         3.2          5.9         2.3     virginica 0
145          6.7         3.3          5.7         2.5 missing_value 0
146          6.7         3.0          5.2         2.3     virginica 0
147          6.3         2.5          5.0         1.9     virginica 0
148          6.5         3.0          5.2         2.0     virginica 1
149          6.2         3.4          5.4         2.3 missing_value 1
150          5.9         3.0          5.1         1.8     virginica 1
  Variable_name imputed_missing
1  Sepal.Length          -99999
2   Sepal.Width          -99999
3  Petal.Length          -99999
4   Petal.Width          -99999
5       Species   missing_value

scorecardModelUtils documentation built on May 2, 2019, 9:59 a.m.