View source: R/nearestNeighbors.R
nearestNeighborsSeparateHitMiss | R Documentation |
Find nearest neighbors of each instance using relief.method. Treat the hit and miss distributions separately to circumvent potential hit bias. The ReliefF version makes hit/miss neighborhoods balanced; the SURF and multiSURF versions are still imbalanced. Used for npdr (no hits or misses specified in neighbor function).
nearestNeighborsSeparateHitMiss(
attr.mat,
pheno.vec,
nbd.method = "relieff",
nbd.metric = "manhattan",
sd.frac = 0.5,
k = 0,
neighbor.sampling = "none",
att_to_remove = c(),
fast.dist = FALSE,
dopar.nn = FALSE
)
attr.mat |
m x p matrix of m instances and p attributes |
pheno.vec |
vector of class values for m instances |
nbd.method |
neighborhood method |
nbd.metric |
used in npdrDistances for distance matrix between instances, default: |
sd.frac |
multiplier of the standard deviation of the distances; sd.frac times the standard deviation is subtracted from the mean distance to create the SURF or multiSURF radius. The multiSURF default "dead-band radius" is sd.frac=0.5: mean - sd/2 |
k |
number of constant nearest hits/misses for |
neighbor.sampling |
"none" or |
att_to_remove |
attributes for removal (possible confounders) from the distance matrix calculation. |
fast.dist |
whether or not distance is computed by the faster algorithm in wordspace, default FALSE |
dopar.nn |
whether or not the neighborhood is computed in parallel, default FALSE |
Ri_NN.idxmat, a matrix of instance indices Ri (first column) and the indices of their nearest neighbors (second column)
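To make the balanced hit/miss idea and the two-column return shape concrete, here is a minimal base-R sketch. It is not the package's implementation: the toy data, the column names Ri_idx and NN_idx, and the helper pairs.for.instance are all hypothetical, and base R's dist() stands in for npdrDistances.

```r
# Toy data (hypothetical values): 8 instances, 3 attributes, binary phenotype
set.seed(42)
toy.mat <- matrix(rnorm(24), nrow = 8, ncol = 3)
toy.pheno <- rep(c(0, 1), each = 4)
k <- 2 # number of nearest hits AND nearest misses kept per instance

# Pairwise Manhattan distances (base R stand-in, not npdrDistances)
dist.mat <- as.matrix(dist(toy.mat, method = "manhattan"))

# For instance i, return its k nearest hits and k nearest misses as index pairs
pairs.for.instance <- function(i) {
  others <- setdiff(seq_len(nrow(dist.mat)), i)
  hits <- others[toy.pheno[others] == toy.pheno[i]]     # same class as i
  misses <- others[toy.pheno[others] != toy.pheno[i]]   # different class from i
  nearest <- function(idx) head(idx[order(dist.mat[i, idx])], k)
  cbind(Ri_idx = i, NN_idx = c(nearest(hits), nearest(misses)))
}

# Stack the per-instance pairs into one two-column index matrix
toy.pairs <- do.call(rbind, lapply(seq_len(nrow(dist.mat)), pairs.for.instance))
head(toy.pairs)
```

Selecting hits and misses separately guarantees that each instance contributes the same number of each, which is the balance the description attributes to the ReliefF version. A single radius applied to all instances at once would not give that guarantee.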
# ReliefF (fixed-k) neighborhood using the default k, equal to the theoretical SURF expected value
# One can change the theoretical value by changing sd.frac (default 0.5)
neighbor.pairs.idx <- nearestNeighborsSeparateHitMiss(
predictors.mat, case.control.3sets$train$class, # need attributes and pheno
nbd.method = "relieff", nbd.metric = "manhattan",
sd.frac = .5, k = 0
)
head(neighbor.pairs.idx)
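The link between sd.frac and an expected neighbor count can be sketched as follows. It assumes pairwise distances are approximately normally distributed, so the fraction of instances closer than mean - sd.frac * sd is the normal tail probability pnorm(-sd.frac). The helper expected.k is hypothetical, and the package's own default may be computed differently.

```r
# Expected number of neighbors inside the radius mean - sd.frac * sd,
# ASSUMING pairwise distances are approximately normal.
# expected.k is a hypothetical helper, not a function exported by the package.
expected.k <- function(m, sd.frac = 0.5) {
  # pnorm(-sd.frac): normal probability of falling below mean - sd.frac * sd
  floor((m - 1) * pnorm(-sd.frac))
}

# A larger sd.frac means a tighter radius and therefore fewer expected neighbors
expected.k(m = 100, sd.frac = 0.5)
expected.k(m = 100, sd.frac = 1)
```

Under this assumption, raising sd.frac shrinks the radius and lowers the expected count, which is consistent with the comment above about changing the theoretical value through sd.frac.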