nearestNeighbors: nearestNeighbors

View source: R/nearestNeighbors.R

nearestNeighbors    R Documentation

nearestNeighbors

Description

Find the nearest neighbors of each instance using the neighborhood method given by nbd.method. Used for npdr (no hits or misses are specified in the neighbor function).

Usage

nearestNeighbors(
  attr.mat,
  nbd.method = "multisurf",
  nbd.metric = "manhattan",
  sd.vec = NULL,
  sd.frac = 0.5,
  k = 0,
  neighbor.sampling = "none",
  att_to_remove = c(),
  fast.dist = FALSE,
  dopar.nn = FALSE
)

Arguments

attr.mat

m x p matrix of m instances and p attributes. If nbd.metric is "precomputed", this must instead be a distance matrix.

nbd.method

neighborhood method: "multisurf" or "surf" (no k needed), or "relieff" (requires k).

nbd.metric

metric used in npdrDistances to compute the distance matrix between instances; defaults to "manhattan" (numeric input). Can be "precomputed" when the user supplies a distance matrix, in which case attr.mat must be that distance matrix.

sd.vec

vector of standard deviations

sd.frac

multiplier of the standard deviation of the distances; the product is subtracted from the mean distance to form the SURF or multiSURF radius. The multiSURF default "dead-band radius" is sd.frac = 0.5, i.e. mean - sd/2.

k

constant number of nearest hits/misses for "relieff" (fixed k). The default k = 0 means use the theoretical expected SURF k, determined by sd.frac (0.5 by default), for the relieff neighborhood.

neighbor.sampling

"none" or "unique"; use "unique" to return only unique neighbor pairs.

att_to_remove

attributes for removal (possible confounders) from the distance matrix calculation.

fast.dist

whether to compute distances with the faster algorithm in wordspace; defaults to FALSE.

dopar.nn

whether to compute the neighborhood in parallel; defaults to FALSE.
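
The sd.frac and k descriptions above can be illustrated with a minimal base-R sketch. The helper names deadband_radius and expected_k are hypothetical and are not part of the package. The sketch assumes the radius is computed per instance from that instance's distances to all others, and that the "theoretical" k is the neighbor count expected if distances were normally distributed; the package's actual computation may differ.

```r
# Hypothetical helpers for illustration only; not the package's implementation.

# Dead-band radius per instance: mean distance minus sd.frac standard deviations.
# dist.mat is assumed to be a symmetric m x m distance matrix with a zero diagonal.
deadband_radius <- function(dist.mat, sd.frac = 0.5) {
  m <- nrow(dist.mat)
  vapply(seq_len(m), function(i) {
    d <- dist.mat[i, -i]              # distances from instance i to all others
    mean(d) - sd.frac * sd(d)
  }, numeric(1))
}

# Expected neighbor count under a normality assumption: the fraction of a
# normal distribution lying below mean - sd.frac * sd is pnorm(-sd.frac).
expected_k <- function(m, sd.frac = 0.5) {
  floor((m - 1) * pnorm(-sd.frac))
}

set.seed(1)
x <- matrix(rnorm(20 * 5), nrow = 20)   # toy data: 20 instances, 5 attributes
dist.mat <- as.matrix(dist(x, method = "manhattan"))

radii <- deadband_radius(dist.mat, sd.frac = 0.5)
k_theory <- expected_k(nrow(x), sd.frac = 0.5)
```

Under this assumption, a larger sd.frac shrinks the radius and lowers the expected k, which is why changing sd.frac also changes the default relieff neighborhood size.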

Value

Ri_NN.idxmat, a matrix of neighbor pairs: each row holds an instance Ri (first column) and one of its nearest neighbors (second column).
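
As a rough sketch of how a two-column pair matrix of this kind can be handled, the following uses a small hand-made matrix (pairs.idx, a hypothetical stand-in for the returned value) to count neighbors per instance and to collapse (i, j) and (j, i) into a single pair. Whether neighbor.sampling = "unique" deduplicates in exactly this way is an assumption.

```r
# Toy stand-in for the returned matrix:
# column 1 = instance index, column 2 = neighbor index.
pairs.idx <- cbind(c(1, 1, 2, 2, 3), c(2, 3, 1, 3, 1))

# Number of neighbors recorded for each instance
nbrs.per.instance <- table(pairs.idx[, 1])

# Treat (i, j) and (j, i) as the same pair:
# put the smaller index first in every row, then drop duplicate rows.
ordered.pairs <- cbind(
  pmin(pairs.idx[, 1], pairs.idx[, 2]),
  pmax(pairs.idx[, 1], pairs.idx[, 2])
)
unique.pairs <- unique(ordered.pairs)
```

Columns are accessed by position here because the column names of the returned matrix are not stated above.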

Examples


# multisurf neighborhood with sigma/2 (sd.frac=0.5) "dead-band" boundary
neighbor.pairs.idx <- nearestNeighbors(
  predictors.mat,
  nbd.method = "multisurf",
  nbd.metric = "manhattan",
  sd.frac = 0.5
)
head(neighbor.pairs.idx)

# reliefF (fixed-k) neighborhood using default `k` equal to
# theoretical surf expected value.
# One can change the theoretical value by changing sd.frac (default 0.5).
neighbor.pairs.idx <- nearestNeighbors(
  predictors.mat,
  nbd.method = "relieff",
  nbd.metric = "manhattan"
)
head(neighbor.pairs.idx)

# reliefF (fixed-k) neighborhood with a user-specified k
neighbor.pairs.idx <- nearestNeighbors(
  predictors.mat,
  nbd.method = "relieff",
  nbd.metric = "manhattan",
  k = 10
)
head(neighbor.pairs.idx)
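
The "precomputed" option described under nbd.metric is not covered by the examples above. A minimal sketch, assuming a full square distance matrix is what the function expects: toy.mat and toy.dist are hypothetical names, and the nearestNeighbors call is left commented out because its behavior with this input is not confirmed here.

```r
# Hypothetical toy data: 30 instances, 4 numeric attributes
set.seed(42)
toy.mat <- matrix(rnorm(30 * 4), nrow = 30)

# Full 30 x 30 Manhattan distance matrix built with base R
toy.dist <- as.matrix(dist(toy.mat, method = "manhattan"))

# Assumed call, following the argument descriptions (not run here):
# neighbor.pairs.idx <- nearestNeighbors(
#   toy.dist,
#   nbd.method = "multisurf",
#   nbd.metric = "precomputed",
#   sd.frac = 0.5
# )
# head(neighbor.pairs.idx)
```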

insilico/npdro documentation built on July 1, 2023, 2:56 p.m.