threshold_distance: threshold_distance

View source: R/distance.r

threshold_distance R Documentation

threshold_distance

Description

Computes the distance between rows and returns the pairs of rows whose distance falls below a threshold.

Usage

threshold_distance(
  data,
  threshold,
  cols = c("x", "y"),
  id_col = "ID",
  extra_columns = NULL,
  as_dataframe = FALSE,
  check_id = TRUE,
  distance_type = c("euclidean", "haversine")
)

Arguments

data

A data.frame whose rows are compared by distance

threshold

Maximum distance between two rows for the pair to be returned

cols

Names of the numeric columns used to compute distance. The data are first sorted by the first of these columns.

id_col

Name of column holding ID data

extra_columns

Names of other columns to expand into the results based on indices. Two new elements will be made for each, one for the i index and one for the j index.

as_dataframe

Logical; if FALSE (the default) a list is returned, if TRUE a data.frame is returned

check_id

Logical; whether the ID column should be checked when deciding which row pairs to include (see Details)

distance_type

Which distance function to use, either "euclidean" or "haversine"
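
To make the two distance_type options concrete, here is a minimal sketch of what Euclidean and haversine distance compute for a single pair of points. The helper names, the (longitude, latitude) column order, degrees as input, and kilometers as output are all assumptions made for illustration; the package's own implementation and units may differ.

```r
# Hypothetical helpers for illustration; not the package's internals.

# Straight-line distance between two points given as numeric vectors.
euclidean_dist <- function(p, q) {
  sqrt(sum((p - q)^2))
}

# Great-circle distance in kilometers between two (longitude, latitude)
# points in degrees, assuming a spherical Earth of radius 6371 km.
haversine_dist <- function(p, q, radius = 6371) {
  to_rad <- pi / 180
  lon1 <- p[1] * to_rad
  lat1 <- p[2] * to_rad
  lon2 <- q[1] * to_rad
  lat2 <- q[2] * to_rad
  a <- sin((lat2 - lat1) / 2)^2 +
    cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)^2
  # min() guards against rounding pushing sqrt(a) slightly above 1.
  2 * radius * asin(min(1, sqrt(a)))
}

euclidean_dist(c(0, 0), c(3, 4))   # 5
haversine_dist(c(0, 0), c(0, 90))  # equator to pole: 6371 * pi / 2 km
```

Euclidean distance suits planar coordinates, while haversine distance is meant for longitude and latitude on a sphere, so the threshold should be expressed in whatever unit the chosen distance returns.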

Details

Computes the distance between rows and returns the pairs of rows whose distance falls below the threshold. Two rows that share the same ID are not compared, so such row pairs are never returned.
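
The behavior described above can be sketched with a brute-force version in base R. This is only an illustration under stated assumptions: the function name naive_threshold_pairs, the output column names, the strict "less than" comparison, and the use of a full distance matrix are all hypothetical and are not taken from the package.

```r
# Hypothetical brute-force version for illustration only; the package's
# own implementation and output names may differ.
naive_threshold_pairs <- function(data, threshold,
                                  cols = c("x", "y"), id_col = "ID") {
  coords <- as.matrix(data[, cols, drop = FALSE])
  ids <- as.character(data[[id_col]])
  # Full Euclidean distance matrix between all rows.
  d <- as.matrix(dist(coords))
  # Look at each unordered pair of rows once (i < j).
  pairs <- which(upper.tri(d), arr.ind = TRUE)
  i <- unname(pairs[, "row"])
  j <- unname(pairs[, "col"])
  distance <- d[pairs]
  # Keep pairs under the threshold whose IDs differ.
  keep <- distance < threshold & ids[i] != ids[j]
  data.frame(i = i[keep], j = j[keep],
             ID_i = ids[i[keep]], ID_j = ids[j[keep]],
             distance = distance[keep])
}

pts <- data.frame(ID = c("A", "B", "A", "C"),
                  x = c(0, 1, 0.5, 10),
                  y = c(0, 0, 0, 10))
# Rows 1 and 3 are close but share ID "A", so that pair is dropped.
naive_threshold_pairs(pts, threshold = 2)
```

A full distance matrix grows quadratically with the number of rows, so this sketch is only practical for small inputs.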

Value

Either a list or a data.frame showing which IDs matched with other IDs, the distance between them, and the row numbers where the pairs occurred.
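
As a rough illustration of how extra_columns can be expanded into a result by index, the sketch below looks up each extra column once for the i rows and once for the j rows of the matched pairs. The result layout, the element names i and j, and the "_i" / "_j" suffixes are assumptions made for this example and may not match the names the function actually returns.

```r
# Hypothetical result and column names, for illustration only.
dat <- data.frame(ID = c("A", "B", "C"),
                  colour = c("red", "green", "blue"),
                  stringsAsFactors = FALSE)

# Suppose rows 1 and 2, and rows 2 and 3, were matched.
matches <- list(i = c(1L, 2L), j = c(2L, 3L))

expand_extra <- function(matches, data, extra_columns) {
  for (col in extra_columns) {
    # One element for the i side of each pair and one for the j side.
    matches[[paste0(col, "_i")]] <- data[[col]][matches$i]
    matches[[paste0(col, "_j")]] <- data[[col]][matches$j]
  }
  matches
}

expanded <- expand_extra(matches, dat, "colour")
expanded$colour_i  # "red" "green"
expanded$colour_j  # "green" "blue"
```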

Author(s)

Jared P. Lander

Examples

# Build a small example data set: 10 rows cycling through three IDs
thedf <- data.frame(
  ID = rep(LETTERS[1:3], length.out = 10),
  x = sample(10),
  y = sample(10),
  extra1 = sample(letters, size = 10),
  extra2 = sample(letters, size = 10),
  extra3 = sample(10)
)

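# Return the matches as a list (the default)
threshold_distance(thedf, threshold = 3, as_dataframe = FALSE)
# Return the matches as a data.frame
threshold_distance(thedf, threshold = 3, as_dataframe = TRUE)
# As above, with check_id turned off
threshold_distance(thedf, threshold = 3, as_dataframe = TRUE, check_id = FALSE)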

jaredlander/distancethreshold documentation built on June 10, 2025, 1:56 a.m.