WhichNClosestTo: Build a training dataset


View source: R/active.R

Description

Build a training dataset from user input about whether pairs of records in the dataset match.

Usage

WhichNClosestTo(n, vec, k)
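
The signature above is not described elsewhere on this page. As a rough sketch only, assuming from the function name that n is the number of indices to return, vec is a numeric vector, and k is the target value, a helper of this shape could look like the following. The name which_n_closest_sketch is hypothetical and this is not the package's source.

```r
# Hypothetical stand-in, NOT the ActiveRL implementation.
# Assumption: return the positions of the n elements of vec closest to k.
which_n_closest_sketch <- function(n, vec, k) {
  distances <- abs(vec - k)   # absolute distance of each element from k
  head(order(distances), n)   # positions of the n smallest distances
}

scores <- c(0.10, 0.45, 0.95, 0.52, 0.30)
closest <- which_n_closest_sketch(2, scores, 0.5)
```

In an active-learning setting, a helper like this might be used to pick the record pairs whose scores sit nearest a decision threshold, but that use is an assumption as well.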

Arguments

RLdata

a data frame containing the records to be matched

n.pairs.to.test

an integer giving the number of record pairs the user wants to test

variables.to.match

a vector of strings containing the variables of interest for this linkage. The default is all variables in RLdata. A variable can be repeated to apply different comparators to the same variable.

string.comparators

a vector of strings naming the comparator to be used for each variable. The default is jarowinkler. It should be the same length as variables.to.match.
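
To illustrate how variables.to.match and string.comparators line up, here is a small sketch. The column names and the "levenshtein" comparator name are hypothetical; only jarowinkler is named on this page, so check which comparator names the package actually accepts.

```r
# Hypothetical column names; last_name is repeated so that two
# different comparators are applied to the same variable.
variables.to.match <- c("first_name", "last_name", "last_name")

# One comparator per entry above. "levenshtein" is an assumed name.
string.comparators <- c("jarowinkler", "jarowinkler", "levenshtein")

# The two vectors are paired position by position, so lengths must agree.
stopifnot(length(variables.to.match) == length(string.comparators))

comparator.plan <- data.frame(
  variable = variables.to.match,
  comparator = string.comparators,
  stringsAsFactors = FALSE
)
```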

current.record.ids

a vector of strings giving the names of the variables that will hold the pairwise combinations of records. The default (which is produced using CompareUniqueCombinations) is c("CurrentRecord1", "CurrentRecord2").
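
As a sketch of what the pairwise combinations might look like, the following base R code enumerates all unique pairs of four record ids and stores them under the default column names. It uses utils::combn as a stand-in and is not a reproduction of CompareUniqueCombinations.

```r
# Four hypothetical record ids.
record.ids <- seq_len(4)

# Each column of combn() is one unique pair; transpose to one pair per row.
pairs <- as.data.frame(t(utils::combn(record.ids, 2)))
names(pairs) <- c("CurrentRecord1", "CurrentRecord2")

# choose(4, 2) = 6 unique pairs, each with CurrentRecord1 < CurrentRecord2.
n.pairs <- nrow(pairs)
```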

standardized.variables

a vector of strings containing the names of all standardized variables. The comparison values for these variables will be averaged. The default is all factor variables in RLdata.
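
The default described above (all factor variables in RLdata) can be sketched in base R as follows. The toy data frame and its column names are made up for illustration, and this is an assumption about how such a default could be computed, not the package's code.

```r
# Toy data: name is a character column, sex and state are factors.
RLdata <- data.frame(
  name = c("Ann", "Anne", "Bob"),
  sex = factor(c("F", "F", "M")),
  state = factor(c("PA", "PA", "NY")),
  stringsAsFactors = FALSE
)

# Keep the names of the columns that are factors.
is.factor.column <- vapply(RLdata, is.factor, logical(1))
standardized.variables <- names(RLdata)[is.factor.column]
```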

Value

A list with the following elements:

comparisons

a data frame containing the comparisons of RLdata

tested.comparisons

a data frame containing the comparison values of records the user tested

tested.data

a data frame containing the original data of records the user tested


kaylafrisoli/ActiveRL documentation built on May 20, 2019, 7:53 a.m.