performance: Assessing the Performance of Record Linkage with a True Match...


View source: R/performance.R

Description

This function takes the match snapshot(s) output by vrmatch and assesses the performance of the match using a variable that shows the true match status, for instance an internally validated voter ID field. Multiple true match variables can be used, for instance a combination of the internal voter ID variable (field A) and the registration affidavit number variable (field B). Under the OR condition, records are true matches if either A or B matches. Alternatively, under the AND condition, records are true matches only if all true match variables match.
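The OR and AND conditions described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper `true_match`, the toy `pairs` data, and the `_a`/`_b` column suffixes are all hypothetical, chosen only to show how agreement on several true match fields might be combined.

```r
# Hypothetical matched pairs: each row pairs a record from one snapshot (_a)
# with a record from another snapshot (_b). The suffixes are an assumption.
pairs <- data.frame(
  lVoterUniqueID_a = c(101, 102, 103, 104),
  lVoterUniqueID_b = c(101, 102, 999, 888),
  sAffNumber_a = c("A1", "A2", "A3", "A4"),
  sAffNumber_b = c("A1", "B7", "A3", "C9"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag true matches from one or more ID fields.
# Missing values are not handled in this sketch.
true_match <- function(pairs, ids, cond = c("or", "and")) {
  cond <- match.arg(cond)
  # One logical column per ID field: does the field agree within the pair?
  agree <- do.call(cbind, lapply(ids, function(id) {
    pairs[[paste0(id, "_a")]] == pairs[[paste0(id, "_b")]]
  }))
  if (cond == "or") {
    rowSums(agree) > 0            # at least one field agrees
  } else {
    rowSums(agree) == length(ids) # every field agrees
  }
}

ids <- c("lVoterUniqueID", "sAffNumber")
or_flags <- true_match(pairs, ids, cond = "or")
and_flags <- true_match(pairs, ids, cond = "and")
```

In this toy data, the second and third pairs agree on only one of the two fields, so they count as true matches under "or" but not under "and".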

Usage

performance(date_df, path_matches = "matches",
  date_label = "date_label", ids = c("lVoterUniqueID", "sAffNumber"),
  cond = "or", inter_id = "inter_id")

Arguments

date_df

Data frame listing the snapshots.

path_matches

Path to which the match outcomes are written. Defaults to "matches".

date_label

Labels for dates (i.e., snapshot IDs), in 'date_df'. Defaults to "date_label".

ids

A vector of true match fields. Defaults to c("lVoterUniqueID", "sAffNumber").

cond

Whether the supplied true match fields should be combined with an OR condition or an AND condition. Defaults to "or".

inter_id

Interim ID used by dplyr::bind_rows when combining the purrr output. Defaults to "inter_id".
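As a rough illustration of what an interim ID does, the sketch below combines a named list of per-snapshot results with `dplyr::bind_rows`, whose `.id` argument stores each list name in a new column. Whether `performance` passes `inter_id` to `.id` in exactly this way is an assumption; the list `results`, its column names, its date labels, and its numbers are made up for the example.

```r
library(dplyr)
library(purrr)

# Hypothetical per-snapshot results, keyed by made-up snapshot labels.
results <- list(
  "2018-04-26" = data.frame(n_pairs = 120L, n_true = 118L),
  "2018-05-03" = data.frame(n_pairs = 95L, n_true = 95L)
)

# purrr::map keeps the list names, so each element stays tied to its snapshot.
rates <- map(results, ~ transform(.x, true_rate = n_true / n_pairs))

# bind_rows stacks the data frames; .id records which list element
# each row came from in a column named "inter_id".
combined <- bind_rows(rates, .id = "inter_id")
```

The resulting `combined` data frame has one row per snapshot, with the snapshot label in the `inter_id` column.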

Details

Note that this differs from fastLink::summary in that it relies on an externally supplied true match variable.

Value

A summary data frame of the performance assessment for all user-specified snapshot matches.


sysilviakim/voterdiffR documentation built on June 22, 2020, 6:51 p.m.