fromList: Generate pairwise comparisons from list items


Description

Given a list indexed by vendor hashes (such as 'titleTokens', ..., generated using the functions in 'step2_anytime.R'), computes a score for each pair of vendors: either the Jaccard similarity of the two list items, or whether their intersection is non-empty (used for PGPs).

Usage

fromList(pairsOfInterest, listName, functionName)

Arguments

pairsOfInterest

Data frame with columns 'hash1' and 'hash2', which hold the vendor hashes of each pair to compare.

listName

Name of the list to compare, e.g. 'titleTokens', 'profileTokens', 'PGPlist'.

functionName

either 'jaccardSimilarity' or 'PGPmatch'. These are currently the only two options.

Value

Vector of length 'nrow(pairsOfInterest)' containing the resulting similarity score for each pair.
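
The pairwise computation described above can be illustrated with a minimal, self-contained sketch. It does not call the package: 'jaccard_sketch', 'pgp_match_sketch' and 'from_list_sketch' are hypothetical stand-ins, the hashes and tokens are made up, and the sketch takes the list object directly rather than its name, so the real 'fromList' may well differ in its details (for instance in how it handles empty items or encodes a PGP match).

```r
# Hypothetical stand-ins written for illustration; not the package's own code.

# Jaccard similarity: size of the intersection divided by size of the union.
jaccard_sketch <- function(a, b) {
  u <- length(union(a, b))
  if (u == 0) return(NA_real_)  # assumption: undefined when both items are empty
  length(intersect(a, b)) / u
}

# Match indicator: 1 if the two items share at least one element, else 0.
pgp_match_sketch <- function(a, b) {
  as.numeric(length(intersect(a, b)) > 0)
}

# Toy list indexed by made-up vendor hashes.
titleTokens <- list(
  h1 = c("alpha", "beta", "gamma"),
  h2 = c("alpha", "beta", "delta"),
  h3 = c("epsilon", "zeta")
)

# Pairs to compare, in the 'hash1' / 'hash2' layout described under Arguments.
pairsOfInterest <- data.frame(
  hash1 = c("h1", "h1"),
  hash2 = c("h2", "h3"),
  stringsAsFactors = FALSE
)

# Apply a comparison function to every row of the pairs data frame.
from_list_sketch <- function(pairs, itemList, fn) {
  mapply(
    function(x, y) fn(itemList[[x]], itemList[[y]]),
    pairs$hash1, pairs$hash2,
    USE.NAMES = FALSE
  )
}

scores  <- from_list_sketch(pairsOfInterest, titleTokens, jaccard_sketch)
matches <- from_list_sketch(pairsOfInterest, titleTokens, pgp_match_sketch)
```

In this toy data 'h1' and 'h2' share two of four distinct tokens, while 'h1' and 'h3' share none, so both result vectors have one entry per row of 'pairsOfInterest'.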


xhtai/heisenbrgr documentation built on June 8, 2019, 9:30 a.m.