findMatch: findMatch
In sigbertklinke/findMatch: Matches observations based on strings with the Levensthein distance


View source: R/findMatch.R

Finds matches between two or more data sets using a text variable (a code or an e-mail address), based on Levenshtein distances. For a detailed application, see the vignette.

findMatch(data, ...)

## Default S3 method:
findMatch(data, vars, dmax = 3, exclude = c("", "."),
  ignore.case = FALSE, unique.id = NULL, output = 50,
  cmpfunc = NULL, ...)

`data`	list of data frames
`...`	further parameters passed to the comparison function (`cmpfunc`)
`vars`	vector of variable names, one for each data frame
`dmax`	maximal Levenshtein distance for matching in text variables ($l(t_{i1}, t_{j2}) < dmax$), defaults to `3`
`exclude`	entries to be excluded from the unique values, defaults to `c('', '.')`
`ignore.case`	if `FALSE` (the default), the unique values are case-sensitive; if `TRUE`, case is ignored
`unique.id`	vector of variables which contain a unique ID over all data sets. If not given then `filename:lineno` will be used.
`output`	number of observations to analyse before progress information is displayed, defaults to `50`
`cmpfunc`	function for comparison of strings of form `fun(x, y, ignore.case, ...)` (default: `adist`)
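
To make the roles of `dmax`, `exclude`, `ignore.case` and `cmpfunc` more concrete, here is a minimal sketch that uses only `utils::adist`, the documented default comparison function. It is not the package's internal code: the code vectors, the helper name `cmp_trimmed` and the filtering steps are assumptions made for illustration, and the strict inequality simply follows the formula given for `dmax`.

```r
# Hypothetical participant codes (illustrative data, not from the package)
codes1 <- c("ABC123", "xyz789", "QRS456", "", ".")
codes2 <- c("ABC124", "XYZ789", "LMN000")

# Drop the entries listed in `exclude` and keep unique values only
exclude <- c("", ".")
codes1 <- setdiff(codes1, exclude)

# Pairwise Levenshtein distances: rows = codes1, columns = codes2
d <- utils::adist(codes1, codes2, ignore.case = FALSE)
dimnames(d) <- list(codes1, codes2)

# Candidate matches: pairs whose distance is strictly below dmax
dmax <- 3
candidates <- which(d < dmax, arr.ind = TRUE)

# The same comparison with case ignored
d_ic <- utils::adist(codes1, codes2, ignore.case = TRUE)
dimnames(d_ic) <- list(codes1, codes2)

# A hypothetical comparison function of the documented form
# fun(x, y, ignore.case, ...): trims whitespace before calling adist
cmp_trimmed <- function(x, y, ignore.case = FALSE, ...) {
  utils::adist(trimws(x), trimws(y), ignore.case = ignore.case, ...)
}
```

In this sketch `"xyz789"` and `"XYZ789"` differ in three characters, so with `dmax = 3` they are not candidates when case matters, but they become an exact match once `ignore.case = TRUE`. A function such as `cmp_trimmed` could presumably be supplied as `cmpfunc`, provided it returns the same kind of distance matrix as `adist`.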

The result consists of a list with three elements:

line: a matrix with the line numbers of the matching observations
idn: a matrix with the common ID ZDV and the original text variables in the data sets
leven: a matrix with the Levenshtein distance between the common ID and the original text variables in the data sets

A list structure with possibly matched observations.

library(findMatch)
set.seed(0)
# create two data sets, t1 and t2, from three groups of observations:
# 200 obs. only in t1, 200 obs. in both t1 and t2, and
# 100 obs. only in t2
n <- list(c(200, 1), c(200, 1, 2), c(100, 2))
x <- generateTestData(n)
# match by code
match <- findMatch(x, c('code', 'code'))
head(match)
summary(match)
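
As a variation on the example above, the following sketch changes only arguments that appear in the usage section (`dmax`, `ignore.case`, `output`). The chosen values are arbitrary, the variable name `res` is an assumption, and the call is guarded so that it runs only when the package is installed.

```r
# Runs only when the findMatch package is available
if (requireNamespace("findMatch", quietly = TRUE)) {
  library(findMatch)
  set.seed(0)
  n <- list(c(200, 1), c(200, 1, 2), c(100, 2))
  x <- generateTestData(n)
  # stricter distance threshold, case-insensitive comparison,
  # and progress information every 100 observations
  res <- findMatch(x, c("code", "code"), dmax = 2,
                   ignore.case = TRUE, output = 100)
  # top-level structure of the returned list
  str(res, max.level = 1)
}
```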

sigbertklinke/findMatch documentation built on July 12, 2019, 9:22 a.m.
