build_multivar_settings: Build settings for multivar matching


View source: R/build_multivar_settings.R

Description

build_multivar_settings is a convenience function that builds the list passed to the multivar_settings argument of merge_plus.

Usage

build_multivar_settings(
  logit = NULL,
  missing = FALSE,
  wgts = NULL,
  compare_type = "diff",
  blocks = NULL,
  blocks.x = NULL,
  blocks.y = NULL,
  top = 1,
  threshold = NULL,
  nthread = 1
)

Arguments

logit

a glm or lm model resulting from a logit regression on a verified dataset. See details.

missing

logical (TRUE/FALSE). Whether to treat missing (NA) observations as their own binary column for each column in by. See details.

wgts

weights used to calculate the matchscore, supplied instead of a model in logit. These can be the weights from calculate_weights.

compare_type

a vector with the same length as "by" that describes how to compare the variables. Options are "in", "indicator", "substr", "difference", "ratio", and "stringdist". See X for details.

blocks

variable present in both data sets to "block" on before computing scores. Matchscores will only be computed for observations that share a block. See details.

blocks.x

name of the blocking variables in x. Cannot be supplied together with blocks.

blocks.y

name of the blocking variables in y. Cannot be supplied together with blocks.

top

integer. Number of matches to return for each observation.

threshold

numeric. Minimum score for a match to be included in the result.

nthread

integer. Number of cores to use when computing all combinations. See parallel::makeCluster().

Value

a list containing options for the 'multivar_settings' argument of merge_plus.
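
Examples

The following is a minimal sketch of how the arguments above might be combined, assuming the fedmatch package is installed. The choice of two comparison variables, the equal weights, and the 0.75 threshold are hypothetical values picked for illustration; they are not recommendations from the package authors.

```r
# Assumes the fedmatch package is installed.
library(fedmatch)

# Hypothetical setup: two 'by' variables, the first compared by string
# distance and the second by numeric difference, weighted equally.
settings <- build_multivar_settings(
  compare_type = c("stringdist", "difference"),
  wgts = c(0.5, 0.5),  # illustrative weights, supplied instead of a logit model
  top = 1,             # keep one match per observation
  threshold = 0.75,    # illustrative minimum score
  nthread = 1
)

# The result is a plain list, so it can be inspected before use.
str(settings)
```

The resulting list would then be passed as the multivar_settings argument of merge_plus. Note that compare_type and wgts each need one entry per variable in by, so their lengths should agree with the by argument used in that call.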


fedmatch documentation built on Nov. 23, 2021, 1:07 a.m.