wNNSel.impute: Weighted Nearest Neighbor Imputation of Missing Values using...


Description

This function imputes the missing values using user-specified values of the tuning parameters. It also works when the number of samples is smaller than the number of covariates.

Usage

wNNSel.impute(x, k, useAll = TRUE, x.initial = NULL, x.dist = "euclidean",
  kernel = "gaussian", lambda = 0.3, impute.fn, convex = TRUE,
  method = "2", m = 2, c = 0.3, withinFolds = FALSE, folds,
  verbose = TRUE, verbose2 = FALSE)

Arguments

x

a matrix containing missing values

k

optional. The number of nearest neighbors to use for imputation.

useAll

logical. The default is useAll=TRUE, that is, all available neighbors are used for the imputation.

x.initial

optional. A complete data matrix, e.g. obtained by mean imputation of x. If provided, it is used for the computation of correlations.

x.dist

distance to compute. The default is x.dist="euclidean", which uses the Euclidean distance. Set x.dist to NULL for Manhattan distance.

kernel

kernel function to be used in nearest neighbors imputation. Default kernel function is "gaussian".

lambda

scalar, a tuning parameter

impute.fn

the imputation function to run on the length-k vector of values for a missing feature. If not specified, wNN imputation is used: a weighted mean of the neighboring values, weighted by the specified kernel.

convex

logical. If TRUE, selected variables are used for the computation of distance. The default is TRUE.

method

the convex function that performs the selection of variables. If method="1", the linear function is used; if method="2", the power function is used.

m

scalar, a tuning parameter required by the power function.

c

scalar, a tuning parameter required by the linear function.

withinFolds

logical. Use only if the neighbors/rows belong to particular folds/groups. Default is set to FALSE.

folds

a list of vectors specifying folds/groups for neighbors. The length of the list is equal to the number of folds/groups. Each element/vector of the list contains the row indices belonging to that particular group/fold.

verbose

logical. If TRUE, prints status updates.

verbose2

logical. If TRUE, prints status updates with more detail.
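
The x.initial and folds arguments both expect objects prepared by the caller. The following is a minimal base-R sketch of one way to build them, assuming column-mean imputation for x.initial and two hypothetical groups of rows for folds. The names col.means and na.idx and the grouping itself are illustrative and are not part of the package. The final call is shown as a comment and uses only arguments listed in the Usage section.

```r
# Toy matrix with missing values (same construction as in the Examples).
set.seed(3)
x <- matrix(rnorm(100), 10, 10)
x[x > 1] <- NA

# Column-mean imputation to obtain a complete matrix for x.initial.
col.means <- colMeans(x, na.rm = TRUE)
x.initial <- x
na.idx <- which(is.na(x), arr.ind = TRUE)        # (row, col) of each NA
x.initial[na.idx] <- col.means[na.idx[, "col"]]  # fill with column means

# Two hypothetical groups of rows: rows 1-5 and rows 6-10.
folds <- list(1:5, 6:10)

# Hypothetical call (requires the wNNSel package):
# wNNSel.impute(x, x.initial = x.initial, withinFolds = TRUE, folds = folds)
```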

Details

For each sample, identify the missing features. For each missing feature, find the nearest neighbors that have that feature observed. Impute the missing value by applying the imputation function to the vector of values taken from those neighbors.
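
The steps above can be sketched in base R as follows. This is an illustration only and not the wNNSel implementation: the helper impute_sketch is hypothetical, it uses all available neighbors, it performs no variable selection, and the weighting (a Gaussian density applied to the distance divided by lambda) is an assumption about how a kernel and a tuning parameter might be combined.

```r
# Illustrative sketch only; NOT the wNNSel implementation.
# Each missing cell is replaced by a kernel-weighted mean of the values of
# that feature in the rows where it is observed.
impute_sketch <- function(x, lambda = 0.3) {
  # Pairwise Euclidean distances between rows; dist() drops columns with
  # an NA in either row and rescales the sum accordingly.
  d <- as.matrix(dist(x))
  out <- x
  for (i in seq_len(nrow(x))) {
    for (j in which(is.na(x[i, ]))) {
      # Neighbors: rows with feature j observed and a usable distance.
      nb <- setdiff(which(!is.na(x[, j])), i)
      nb <- nb[is.finite(d[i, nb])]
      if (length(nb) == 0) next
      # Assumed weighting: Gaussian kernel of the scaled distance.
      w <- dnorm(d[i, nb] / lambda)
      # Fall back to a plain mean if all weights underflow to zero.
      out[i, j] <- if (sum(w) > 0) sum(w * x[nb, j]) / sum(w) else mean(x[nb, j])
    }
  }
  out
}

# Usage on the toy data from the Examples section.
set.seed(3)
x <- matrix(rnorm(100), 10, 10)
x[x > 1] <- NA
x.imp <- impute_sketch(x, lambda = 0.5)
```

Because each imputed value is a weighted mean with non-negative weights, it always lies within the range of the observed values of that feature, and observed entries are left unchanged.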

Value

imputed data matrix

See Also

cv.wNNSel, wNNSel

Examples

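  library(wNNSel)
  set.seed(3)
  # 10 x 10 matrix of standard normal values
  x <- matrix(rnorm(100), 10, 10)
  # mark every entry greater than 1 as missing
  x.miss <- x > 1
  x[x.miss] <- NA
  # impute with the default tuning parameters
  wNNSel.impute(x)
  # impute with user-specified tuning parameters
  wNNSel.impute(x, lambda = 0.5, m = 2)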

Example output

[1] "Computing distance matrix..."
[1] "Distance matrix complete"
             [,1]       [,2]        [,3]        [,4]       [,5]       [,6]
 [1,] -0.96193342 -0.7447816 -0.57848372  0.90062473  0.7865069  0.7268389
 [2,] -0.29252572 -1.1312186 -0.94230073  0.85177045 -0.3104631 -0.8094409
 [3,]  0.25878822 -0.7163585 -0.20372818  0.72771517 -0.4247951  0.2670851
 [4,] -1.15213189  0.2526524 -1.66647484  0.73650215 -0.7945937 -1.7372637
 [5,]  0.19578283  0.1520457 -0.48445511 -0.35212962  0.3484377 -1.4114251
 [6,]  0.03012394 -0.3076564 -0.74107266  0.70551551 -2.2654011 -0.4535512
 [7,]  0.08541773 -0.9530173 -0.80491293  0.48882153 -0.1622053 -1.0354913
 [8,] -0.33290860 -0.6482428 -1.01859623  0.03825201 -0.5260079 -0.5805169
 [9,] -1.21885742 -0.5151061 -0.07207847 -0.97928377 -0.4555460  0.9174567
[10,] -0.33939445  0.1998116 -1.13678230  0.79376123 -0.8991663 -0.7851422
             [,7]       [,8]         [,9]       [,10]
 [1,]  0.57351817 -0.0313255 -0.024515176 -0.85381845
 [2,]  0.91819621  0.4670973 -0.352298306 -0.98999433
 [3,]  0.25628727  0.2754801  0.688640044 -0.65087774
 [4,]  0.35196656  0.2673585 -0.284981169 -0.34383795
 [5,]  0.08818601  0.2318261  0.794296303 -0.39087803
 [6,] -0.48084638  0.7475925 -0.006402398 -0.07058639
 [7,] -0.41882972  0.2565972  0.219150635 -0.46205081
 [8,]  0.95511280  0.3833583 -0.886463751  0.54090827
 [9,] -1.28900661 -0.9880528  0.439760291  0.93163497
[10,]  0.18619743 -0.1568529 -0.886389751 -0.20927435
[1] "Computing distance matrix..."
[1] "Distance matrix complete"
             [,1]       [,2]        [,3]        [,4]       [,5]       [,6]
 [1,] -0.96193342 -0.7447816 -0.57848372  0.90062473  0.7865069  0.7268389
 [2,] -0.29252572 -1.1312186 -0.94230073  0.85177045 -0.3104631 -0.8094409
 [3,]  0.25878822 -0.7163585 -0.20372818  0.72771517 -0.4535578  0.2670851
 [4,] -1.15213189  0.2526524 -1.66647484  0.73650215 -0.7945937 -1.7372637
 [5,]  0.19578283  0.1520457 -0.48445511 -0.35212962  0.3484377 -1.4114251
 [6,]  0.03012394 -0.3076564 -0.74107266  0.70551551 -2.2654011 -0.4535512
 [7,]  0.08541773 -0.9530173 -0.77147803  0.43836195 -0.1622053 -1.0354913
 [8,] -0.35832845 -0.6482428 -0.87243770  0.03825201 -0.4898775 -0.5296372
 [9,] -1.21885742 -0.4692480 -0.07207847 -0.97928377 -0.4555460  0.9174567
[10,] -0.36438973  0.1998116 -1.13678230  0.79376123 -0.8991663 -0.7851422
            [,7]       [,8]         [,9]       [,10]
 [1,]  0.5735182 -0.0313255 -0.013295527 -0.85381845
 [2,]  0.9181962  0.4670973 -0.352298306 -0.98999433
 [3,]  0.2562873  0.1966778  0.688640044 -0.65087774
 [4,]  0.3519666  0.2673585 -0.122058930 -0.31127665
 [5,]  0.1109909  0.2318261  0.794296303 -0.39087803
 [6,] -0.4808464  0.7475925 -0.006402398 -0.07058639
 [7,] -0.4188297  0.1940539  0.219150635 -0.46205081
 [8,]  0.9551128  0.3833583 -0.886463751  0.54090827
 [9,] -1.2890066 -0.9880528  0.439760291  0.93163497
[10,]  0.1861974 -0.1568529 -0.886389751 -0.20927435

wNNSel documentation built on May 2, 2019, 2:49 p.m.