Find duplicates in one or two data sets
matchEpiData(dat1, dat2 = NULL, funlist = list(), thresh = 0.05,
  giveWeight = FALSE)
dat1 | An input linelist.
dat2 | An optional second linelist.
funlist | A list of lists, each describing one comparison. In the example for this function, each inner list has the elements d1vars, d2vars, fun, extraparams, and weight.
thresh | A threshold below which two rows are considered nearly identical.
giveWeight | A logical indicating whether the output should be a list of weights (TRUE) or a list of indices (FALSE, the default).
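As a minimal sketch of how a funlist could be assembled, the helper below builds one inner list per comparison. It assumes the five element names seen in the example for this function (d1vars, d2vars, fun, extraparams, weight) are the ones matchEpiData expects; make_comparison is a hypothetical helper and is not part of epimatch.

```r
# Hypothetical helper (not part of epimatch): builds one funlist entry.
# The element names are copied from the example on this page; the default
# values chosen here are assumptions made for illustration only.
make_comparison <- function(d1vars, d2vars = d1vars, fun = "nameDists",
                            extraparams = NULL, weight = 1) {
  stopifnot(is.character(d1vars), is.character(d2vars),
            is.numeric(weight), length(weight) == 1)
  list(d1vars = d1vars, d2vars = d2vars, fun = fun,
       extraparams = extraparams, weight = weight)
}

# Two comparisons: one on the ID column, one on the name columns
funlist <- list(
  make_comparison("ID"),
  make_comparison(c("Surname", "OtherNames"),
                  c("SurnameLab", "OtherNameLab"),
                  weight = 0.5)
)

# Inspect the nested structure
str(funlist)
```

Wrapping the construction in a helper keeps the inner lists consistent; the resulting object could then be passed as the funlist argument.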
This function takes one or two data sets, a list of functions
to apply to specific columns of the data sets, and a threshold that determines
what counts as a match. It returns a list from returnMatches
in which each element represents a different potential match. Each
element is itself a two-element list containing either the indices or the
weights for each sample that matched below the threshold.
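The nested return structure described above can be illustrated without running the matcher. The sketch below builds a stand-in result by hand and uses it to look up rows in two toy data frames. The element names d1 and d2 follow the example for this function; the data, the index values, and the names res_mock, case_mock, and lab_mock are made up for illustration and are not output from matchEpiData.

```r
# Hand-built stand-in for a result with giveWeight = FALSE.
# Each element is one potential match holding row indices into each data set.
res_mock <- list(
  list(d1 = c(1L, 4L), d2 = 2L),
  list(d1 = 3L,        d2 = c(5L, 6L))
)

# Toy data sets (hypothetical values)
case_mock <- data.frame(Surname = c("Smith", "Jones", "Lee", "Smyth"),
                        stringsAsFactors = FALSE)
lab_mock <- data.frame(SurnameLab = c("Jones", "Smith", "Adams", "Brown",
                                      "Lee", "Li"),
                       stringsAsFactors = FALSE)

# For every potential match, pull the matching surnames from both data sets
pairs <- lapply(res_mock, function(m) {
  list(d1 = case_mock$Surname[m$d1],
       d2 = lab_mock$SurnameLab[m$d2])
})

str(pairs)
```

With giveWeight = TRUE the same positions would presumably hold weights instead of indices, so they could not be used directly for subsetting.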
something
## Loading Data
indata <- system.file("files", package = "epimatch")
indata <- dir(indata, full.names = TRUE)
x <- lapply(indata, read.csv, stringsAsFactors = FALSE)
names(x) <- basename(indata)

# We will use one data set from the case information and lab results
case <- x[["CaseInformationForm.csv"]]
lab <- x[["LaboratoryResultsForm7.csv"]]

# This will get all of the indices that match the ID and Names with a
# threshold of 0.25
res <- matchEpiData(dat1 = case,
                    dat2 = lab,
                    funlist = list(
                      list(d1vars = "ID",
                           d2vars = "ID",
                           fun = "nameDists",
                           extraparams = NULL,
                           weight = 1),
                      list(d1vars = c("Surname", "OtherNames"),
                           d2vars = c("SurnameLab", "OtherNameLab"),
                           fun = "nameDists",
                           extraparams = NULL,
                           weight = 0.5)
                    ),
                    thresh = 0.25)

# List of indices
res

# Printing out the matching names in decreasing order of matching
invisible(lapply(res, function(i) {
  print(case[i$d1, c("Surname", "OtherNames")])
  print(lab[i$d2, c("SurnameLab", "OtherNameLab")])
  cat("\n\t--------\n")
}))