returnMatches: return matching indices

Description

This function collapses/sums the individual matrices from different feature columns together.

Weights are not currently implemented: multiplying each matrix by 1 minus its weight did not give distances that clearly separated true patient matches from false ones.

The matrices are simply summed and then divided by the total number of matrices supplied, giving a single output distance matrix. Each entry is on a scale of 0 to 1, with 0 being a perfect match (zero distance between the two rows).
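
The averaging step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's own code: the matrices `nameDist` and `dateDist` and the list `distList` are hypothetical names, and each input is assumed to be already scaled to the range 0 to 1.

```r
# Hypothetical per-feature distance matrices, each already scaled to [0, 1].
nameDist <- matrix(c(0.0, 0.4,
                     0.4, 0.0), nrow = 2, byrow = TRUE)
dateDist <- matrix(c(0.0, 1.0,
                     1.0, 0.0), nrow = 2, byrow = TRUE)
distList <- list(nameDist, dateDist)

# Sum the matrices element-wise, then divide by how many were supplied,
# so the combined matrix stays on the same 0-to-1 scale.
combined <- Reduce(`+`, distList) / length(distList)
combined
```

Because every input lies between 0 and 1, the unweighted mean does too, which is what lets a single threshold be applied to the combined matrix.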

Usage

returnMatches(nRowD1, nRowD2, distMatrix, thresh)

Arguments

nRowD1, nRowD2

the number of rows in each of the two datasets (given in the same order used for all other functions)

distMatrix

a distance matrix

thresh

distance threshold (smaller means more conservative matches)

Value

a list in which each element holds the row indices in dataset 1 (d1) and dataset 2 (d2) that correspond to the same person
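
To show how a distance threshold can turn a distance matrix into matching row indices, here is a minimal sketch in base R. It is not the implementation of `returnMatches`: the matrix `d` is made-up data, and treating "at or below `thresh`" as a match is an assumption.

```r
# Hypothetical 3 x 3 distance matrix between the rows of a single dataset.
d <- matrix(c(0.0, 0.2, 0.9,
              0.2, 0.0, 0.7,
              0.9, 0.7, 0.0), nrow = 3, byrow = TRUE)
thresh <- 0.5

# Row/column index pairs whose distance is at or below the threshold
# (assumed comparison; the package may use a different rule).
hits <- which(d <= thresh, arr.ind = TRUE)

# Keep only pairs above the diagonal: this drops self-matches and
# the mirrored duplicate of each pair.
hits <- hits[hits[, "row"] < hits[, "col"], , drop = FALSE]
hits
```

In this sketch, lowering `thresh` admits fewer pairs, which is the sense in which a smaller threshold gives more conservative matches.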

Examples

# in this example, we will match one data set with the name distances
set.seed(9)
x <- data.frame(x = letters, y = LETTERS, z = 1:26)
x <- x[sample(nrow(x), 10, replace = TRUE), ]
y <- nameDists(x)
returnMatches(nrow(x), NULL, y, 0.8)

# in this example, the same row numbers in both datasets match
caseIDVector <- c("AB-10-1", "AB-10-5", "AB-10_1")
distMatrix <- adist(c(caseIDVector, caseIDVector), c(caseIDVector, caseIDVector))
returnMatches(3, 3, distMatrix, thresh = 0.5)

Hackout3/epimatch documentation built on May 6, 2019, 9:48 p.m.