select_neighbours: Select Subset of Rows Closest to a Specified Observation

View source: R/select_neighbours.R

Description

The select_neighbours() function selects the subset of rows of a data set that are closest to a specified observation. This is useful when the data is large and only a sample is needed to calculate profiles.

Usage

select_neighbours(
  observation,
  data,
  variables = NULL,
  distance = gower::gower_dist,
  n = 20,
  frac = NULL
)

Arguments

observation

a single observation (for example, a one-row data frame, as in the Examples)

data

the set of observations from which neighbours are selected

variables

names of the variables used to calculate the distance. By default, all variables present in both data and observation are used.

distance

the distance function; by default gower::gower_dist().

n

number of neighbors to select

frac

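fraction of rows to select. If n is not specified (NULL), the number of neighbours is calculated as frac * number of rows in data. Either n or frac needs to be specified; because n defaults to 20, pass n = NULL to make frac take effect.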
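
To make the interplay of the arguments above concrete, here is a minimal base-R sketch of nearest-row selection. It is not the package's implementation: sketch_select() and euclidean_dist() are hypothetical names, Euclidean distance stands in for the Gower default, and rounding frac * nrow(data) up is an assumption.

```r
# Hypothetical helper, not part of ingredients: Euclidean distance from a
# one-row data frame `x` to every row of `y` (numeric columns only).
euclidean_dist <- function(x, y) {
  x_vec <- unlist(x[1, , drop = FALSE], use.names = FALSE)
  # Subtract the observation from each row, then take row-wise norms
  diffs <- sweep(as.matrix(y), 2, x_vec, FUN = "-")
  sqrt(rowSums(diffs^2))
}

# Sketch of the selection logic; the name and body are assumptions.
sketch_select <- function(observation, data, variables = NULL,
                          distance = euclidean_dist, n = 20, frac = NULL) {
  # Default: use the variables shared by the observation and the data
  if (is.null(variables)) {
    variables <- intersect(colnames(observation), colnames(data))
  }
  # frac is consulted only when n is NULL
  if (is.null(n)) {
    if (is.null(frac)) stop("Either n or frac must be specified")
    n <- ceiling(frac * nrow(data))  # rounding up is an assumption
  }
  d <- distance(observation[, variables, drop = FALSE],
                data[, variables, drop = FALSE])
  # Keep the n rows with the smallest distances, all columns included
  data[head(order(d), n), , drop = FALSE]
}

obs <- data.frame(a = 0, b = 0)
pts <- data.frame(a = c(3, 0, 1, 6), b = c(4, 1, 1, 8),
                  label = c("p", "q", "r", "s"))

by_n    <- sketch_select(obs, pts, n = 2)
by_frac <- sketch_select(obs, pts, n = NULL, frac = 0.75)
```

In this sketch the distance is computed only on the shared columns a and b, while the returned rows keep every column of the data, including label.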

Details

Note that the select_neighbours() function is an S3 generic. If you want to work with non-standard data sources (such as H2O ddf or external databases), you should overload it.
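
As a reminder of what overloading an S3 generic involves, the sketch below uses a hypothetical generic nearest_rows() and a made-up class "remote_table". It illustrates the dispatch mechanism only and does not call select_neighbours(); a real method for an external source would query that source instead of reading a local copy.

```r
# Hypothetical generic, used only to illustrate S3 dispatch
nearest_rows <- function(data, ...) UseMethod("nearest_rows")

# Default method for ordinary data frames: keep the n rows with the
# smallest precomputed distances
nearest_rows.default <- function(data, distances, n = 20, ...) {
  data[head(order(distances), n), , drop = FALSE]
}

# Method for the made-up class "remote_table", which wraps a data frame
nearest_rows.remote_table <- function(data, distances, n = 20, ...) {
  local_copy <- data$rows  # stand-in for fetching rows from the source
  nearest_rows(local_copy, distances = distances, n = n)
}

df <- data.frame(id = 1:4, x = c(10, 20, 30, 40))
remote <- structure(list(rows = df), class = "remote_table")

# Dispatches to nearest_rows.remote_table, which delegates to the default
picked <- nearest_rows(remote, distances = c(0.9, 0.1, 0.5, 0.3), n = 2)
```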

Value

a data frame with selected rows

Examples

library("ingredients")

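# Take one apartment as the reference observation
new_apartment <- DALEX::apartments[1,]
# Select the 10 rows of the test set closest to it
small_apartments <- select_neighbours(new_apartment, DALEX::apartments_test, n = 10)

# Inspect the observation and its selected neighbours
new_apartment
small_apartments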

ingredients documentation built on April 10, 2021, 5:06 p.m.