View source: R/preprocessing_imputation.R
preprocessing_imputation | R Documentation |
`median-other`
The numeric features are imputed with the median value,
whereas the categorical ones are imputed with the 'other' string. It is a fast method.
`median-frequency`
The numeric features are imputed with the median value,
whereas the categorical ones are imputed with the most frequent value. It is a fast method.
`knn`
All features are imputed with the KNN algorithm. It is a moderately fast method.
`mice`
All features are imputed with the MICE algorithm. It is a slow method.
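The two fast methods can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper `impute_simple()` and the `toy` data frame are hypothetical names introduced only for illustration, and the sketch assumes that any non-numeric column counts as categorical.

```r
# Illustrative sketch only; impute_simple() is a hypothetical helper,
# not part of the documented package.
impute_simple <- function(df, method = c("median-other", "median-frequency")) {
  method <- match.arg(method)
  for (col in names(df)) {
    x <- df[[col]]
    is_missing <- is.na(x)
    if (!any(is_missing)) next
    if (is.numeric(x)) {
      # Numeric features: fill with the median of the observed values.
      x[is_missing] <- median(x, na.rm = TRUE)
    } else {
      # Assumption: every non-numeric column is categorical.
      # Factors are converted to character so new values can be assigned.
      x <- as.character(x)
      if (method == "median-other") {
        x[is_missing] <- "other"
      } else {
        counts <- table(x)  # table() ignores NA by default
        x[is_missing] <- names(counts)[which.max(counts)]
      }
    }
    df[[col]] <- x
  }
  df
}

toy <- data.frame(
  age  = c(21, NA, 35, 40),
  city = c("Oslo", "Oslo", NA, "Rome"),
  stringsAsFactors = FALSE
)

impute_simple(toy, "median-other")
impute_simple(toy, "median-frequency")
```

With `median-other` the missing city becomes the literal string 'other', while `median-frequency` reuses the most common observed city; in both cases the missing age is replaced by the median of the observed ages.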
preprocessing_imputation(
data,
na_indicators = c(""),
imputation_method = "median-other",
k = 10,
m = 5,
verbose = FALSE
)
data |
A data source in one of the major R formats: data.table, data.frame, matrix, and so on. |
na_indicators |
A list containing the values that will be treated as NA indicators. By default the list is c(""). WARNING: Do not include NA or NaN, as these are already checked by another criterion. |
imputation_method |
A string value indicating the imputation method. It must be one of 'median-other', 'median-frequency', 'knn', or 'mice'. |
k |
An integer describing the number of nearest neighbours to use. By default set to 10. This parameter applies only if the selected 'imputation_method' is 'knn'. |
m |
An integer describing the number of multiple imputations to use. By default set to 5. This parameter applies only if the selected 'imputation_method' is 'mice'. |
verbose |
A logical value. If set to TRUE, all information about the preprocessing process is printed; if FALSE, none is. |
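The role of `na_indicators` can be sketched in base R as follows. This is only an assumption about the behaviour described above (listed values are treated as missing before imputation), not the package's code; `raw` and `cleaned` are hypothetical names.

```r
# Illustrative sketch: mark the listed placeholder values as NA.
na_indicators <- c("", "?")

raw <- data.frame(
  x = c("a", "", "?"),
  y = c(1, 2, NA),
  stringsAsFactors = FALSE
)

cleaned <- raw
cleaned[] <- lapply(raw, function(col) {
  # Values matching any indicator become NA; existing NA values stay NA.
  col[col %in% na_indicators] <- NA
  col
})

colSums(is.na(cleaned))
```

Existing NA and NaN values need no entry in the vector, which matches the warning in the `na_indicators` description.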
Imputed dataset.
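A call might look like the sketch below. It assumes the package exporting `preprocessing_imputation()` is already attached, so the call is guarded; the `toy` data frame is hypothetical, and the exact structure of the returned dataset is not specified here, so the result is only inspected with `str()`.

```r
# Hypothetical example data with one missing number and one empty string.
toy <- data.frame(
  age  = c(21, NA, 35, 40),
  city = c("Oslo", "", "Oslo", "Rome"),
  stringsAsFactors = FALSE
)

# Guarded call: runs only if the function is available in this session.
if (exists("preprocessing_imputation", mode = "function")) {
  imputed <- preprocessing_imputation(
    data = toy,
    na_indicators = c(""),              # treat empty strings as missing
    imputation_method = "median-frequency",
    verbose = FALSE
  )
  str(imputed)
} else {
  message("preprocessing_imputation() is not available in this session.")
}
```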