This function conducts nearest neighbor imputation with the added option
of using a sequence of neighbor values instead of picking one. One
imputed dataset is created for each value of nearest neighbors (k).
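The idea of producing one imputed dataset per value of k can be illustrated with a minimal base R sketch. It is not the package's implementation: `knn_impute_sketch` is a hypothetical helper, and the choices of Euclidean distance, a single numeric target column, and the mean as the aggregate are assumptions made only to keep the illustration short.

```r
# Hypothetical helper, not part of the documented package. It imputes the
# missing values of one numeric column from the k nearest rows, using
# Euclidean distance on the remaining (fully observed, numeric) columns,
# and repeats this once for every value in k_neighbors.
knn_impute_sketch <- function(data, target, k_neighbors) {
  predictors <- setdiff(names(data), target)
  x <- as.matrix(data[, predictors, drop = FALSE])
  miss <- which(is.na(data[[target]]))   # rows to impute
  obs <- which(!is.na(data[[target]]))   # candidate neighbors

  lapply(k_neighbors, function(k) {
    imputed <- data
    for (i in miss) {
      # distance from row i to every row with an observed target value
      row_i <- matrix(x[i, ], nrow = length(obs), ncol = ncol(x), byrow = TRUE)
      d <- sqrt(rowSums((x[obs, , drop = FALSE] - row_i)^2))
      # indices of the k closest observed rows
      nbrs <- obs[order(d)[seq_len(min(k, length(obs)))]]
      # aggregate the neighbors' values (assumed here: the mean)
      imputed[[target]][i] <- mean(data[[target]][nbrs])
    }
    imputed
  })
}

toy <- data.frame(
  x = c(1, 2, 3, 4.5, 10),
  y = c(10, 20, NA, 40, 100)
)

# One imputed copy of `toy` for each of k = 1, 2, 3
imputes <- knn_impute_sketch(toy, target = "y", k_neighbors = 1:3)
```

In this toy example the result is a list with three data frames, and the imputed value changes with k because more distant rows are pulled into the average as k grows.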
data_ref |
a data frame. |
data_new |
an optional data frame. If supplied, then |
cols |
columns that should be imputed and/or used to impute other columns. Supports tidy select functions (see examples). |
k_neighbors |
a numeric vector indicating how many neighbors should be used to impute missing values. |
aggregate |
a logical value. If |
fun_aggr_ctns |
a function used to aggregate neighbors for continuous
variables. If unspecified, the |
fun_aggr_intg |
a function used to aggregate neighbors for integer-valued
variables. If unspecified, the |
fun_aggr_catg |
a function used to aggregate neighbors for categorical
variables. If unspecified, the |
nthread |
Number of threads to use for parallelization. By default, 2 threads are used on a dual-core machine; on any other machine, n-1 cores are used so that the machine does not freeze during a large computation. The maximum number of threads is determined using omp_get_max_threads at the C level. |
epsilon |
Computed numbers (variable ranges) smaller than epsilon are treated as zero. |
verbose |
logical value. If |
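As an illustration of the three aggregation arguments (fun_aggr_ctns, fun_aggr_intg, fun_aggr_catg), the sketch below defines one candidate function per variable type. The function names are hypothetical, and it is an assumption that each aggregator receives a vector of neighbor values and returns a single value; the defaults used by the package are not shown here.

```r
# Hypothetical aggregation functions. Each one is assumed to take the
# neighbors' values for a single missing cell and return one imputed value.

# Continuous variables: median of the neighbors
aggr_ctns <- function(x) median(x, na.rm = TRUE)

# Integer-valued variables: median rounded back to an integer
aggr_intg <- function(x) as.integer(round(median(x, na.rm = TRUE)))

# Categorical variables: most frequent category among the neighbors
aggr_catg <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

ctns_value <- aggr_ctns(c(1.5, 2.5, 10))      # 2.5
intg_value <- aggr_intg(c(1L, 2L, 2L, 7L))    # 2L
catg_value <- aggr_catg(c("a", "b", "b"))     # "b"
```

If the package accepts functions of this shape, they would be supplied through the corresponding fun_aggr_* argument; that usage is not confirmed by the text above.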
a list of imputed datasets the same length as k_neighbors.
data(diabetes, package = 'ipa')
trn <- diabetes$missing[1:25, ]
tst <- diabetes$missing[26:50, ]
trn_imputes <- impute_nbrs(data_ref = trn, k_neighbors = 1:5)
tst_imputes <- impute_nbrs(data_ref = trn, data_new = tst, k_neighbors = 1:5)