distcalc: Calculate distances between character and numeric variables

View source: R/distances.R

distcalc R Documentation

Description

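distcalc calculates string distances for character variables and numeric distances for numeric variables, comparing the _from and _to version of each variable in a candidates dataset.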

Usage

distcalc(
  dat,
  character_variables = c("mlast", "mfirst", "wfirst", "minitials", "winitials", "mprof"),
  numeric_variables = c("year")
)

Arguments

dat

A dataset (a data.table). distcalc expects this dataset to follow the naming conventions of the candidates function: variables from the two original datasets share the same base name and are distinguished by the suffixes _from and _to.
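
The naming convention can be illustrated with a minimal base R sketch. It is not distcalc's implementation: the distance measure (Levenshtein via utils::adist), the plain difference for numeric variables, and the output column names mlast_dist and year_dist are assumptions made for illustration only.

```r
# Hypothetical candidate pairs following the _from/_to naming convention.
pairs <- data.frame(
  mlast_from = c("jong", "jong", "smid"),
  mlast_to   = c("jongh", "jong", "smit"),
  year_from  = c(1800, 1800, 1810),
  year_to    = c(1802, 1800, 1809),
  stringsAsFactors = FALSE
)

# Variable names are passed without suffixes; the suffixed column
# names are derived from them.
character_variables <- "mlast"
from_cols <- paste0(character_variables, "_from")
to_cols   <- paste0(character_variables, "_to")

# Assumed string distance: element-wise Levenshtein distance.
pairs$mlast_dist <- mapply(
  function(a, b) as.integer(utils::adist(a, b)),
  pairs[[from_cols]], pairs[[to_cols]],
  USE.NAMES = FALSE
)

# Assumed numeric distance: plain difference between _to and _from.
pairs$year_dist <- pairs$year_to - pairs$year_from

pairs
```

The measure and column names that distcalc actually uses may differ; inspect the returned dataset to see them.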

character_variables

The names of the character variables, without the _from and _to suffixes. Set to a zero-length vector (c()) if no character variables are to be used.

numeric_variables

The names of the numeric variables, without the _from and _to suffixes. Set to a zero-length vector (c()) if no numeric variables are to be used.

Value

The dataset with the distances needed to predict links. Reassigning the data.table is not necessary: the original dataset is modified in place.
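
In-place modification is standard data.table behaviour when columns are added with the := operator. The sketch below shows the general mechanism only; add_flag and the column same are hypothetical names and are not part of capelinker.

```r
library(data.table)

# Hypothetical helper: adds a column by reference using `:=`.
add_flag <- function(dt) {
  dt[, same := mlast_from == mlast_to]
  invisible(dt)
}

cnd <- data.table(
  mlast_from = c("jong", "smid"),
  mlast_to   = c("jong", "smit")
)

# No reassignment: cnd itself gains the new column.
add_flag(cnd)
names(cnd)
```

Because the table is changed by reference, use data.table::copy() first if the original dataset must be preserved.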

Examples

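library(capelinker)
# Two small datasets to be linked on last name (mlast)
d1 <- data.table::data.table(mlast = c("jong", "smid"), persid = 1:2)
d2 <- data.table::data.table(mlast = c("jongh", "jong", "smit"), persid = 1:3)
# Build candidate pairs; variables get the _from and _to suffixes
d1d2cnd <- candidates(d1, d2)
# Distances are added to d1d2cnd in place, so no reassignment is needed
distcalc(d1d2cnd, character_variables = "mlast", numeric_variables = c())
d1d2cnd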


rijpma/capelinker documentation built on Nov. 7, 2024, 3:06 a.m.