merge_plus: merge_plus

View source: R/merge_plus.R

Description

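Merges two datasets, with added checks on the merge keys, optional fuzzy matching, match scoring, filtering, and evaluation of the result.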

Usage

merge_plus(data1, data2, by = NULL, by.x = by, by.y = by,
  suffixes = c(".x", ".y"), check_merge = TRUE, unique_key_1, unique_key_2,
  match_type = "exact",
  amatch.args = list(method = "jw", p = 0.1, maxDist = 0.05, matchNA = FALSE),
  score_settings = NULL, filter = NULL, filter.args = list(),
  evaluate = match_evaluate, evaluate.args = list())

Arguments

data1

data.frame. First to-merge dataset.

data2

data.frame. Second to-merge dataset.

by

character string. Variables to merge on (common to data1 and data2). See merge.

by.x

character string. Variable to merge on in data1. See merge

by.y

character string. Variable to merge on in data2. See merge

suffixes

character vector of length 2. Suffixes to add to like-named variables after the merge. See merge.
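Since by, by.x, by.y, and suffixes are documented as following merge, a base R merge call gives a rough picture of their effect. This is a sketch using base merge only, not merge_plus itself; the data frames and the column name firm are made up for illustration.

```r
# Two toy datasets whose key columns have different names ("name" vs "firm")
x <- data.frame(id = 1:3, name = c("a", "b", "c"), amount = 101:103,
                stringsAsFactors = FALSE)
y <- data.frame(id = 6:8, firm = c("b", "c", "d"), amount = rep(102, 3),
                stringsAsFactors = FALSE)

# by.x / by.y name the key in each dataset; suffixes disambiguate the
# like-named non-key columns (id and amount)
merged <- merge(x, y, by.x = "name", by.y = "firm", suffixes = c("_1", "_2"))
names(merged)
```

The key column keeps the by.x name, and the remaining like-named columns receive the two suffixes in order.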

check_merge

logical. If TRUE, checks that unique_key_1 and unique_key_2 are indeed unique, and prevents the merge from running if it would result in a data.frame larger than 5 million rows.

unique_key_1

character vector. Primary key of data1 that uniquely identifies each row (can be multiple fields)

unique_key_2

character vector. Primary key of data2 that uniquely identifies each row (can be multiple fields)
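To check ahead of time whether a candidate key uniquely identifies each row, something like the following base R helper could be used. It is only a sketch: is_unique_key is a hypothetical name, and this is not necessarily how check_merge performs its test.

```r
# Hypothetical helper: TRUE if the columns in `key` jointly identify each row
is_unique_key <- function(df, key) {
  # anyDuplicated() returns 0 when no row of the key columns is repeated
  anyDuplicated(df[, key, drop = FALSE]) == 0
}

x <- data.frame(id = 1:5, name = c("a", "b", "c", "d", "d"),
                stringsAsFactors = FALSE)

is_unique_key(x, "id")             # id has no repeats
is_unique_key(x, "name")           # "d" appears twice
is_unique_key(x, c("id", "name"))  # multi-field key
```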

score_settings

list. Score settings. See vignette matchscore.

filter

function or numeric. Filters the merged data1-data2 dataset. If a function, it should take a data.frame (data1 and data2 merged by name1 and name2) as its first argument and return a trimmed version of that data.frame (fewer rows). Think of this function as applying conditions to matches beyond the match by name. If numeric, all observations with a matchscore lower than or equal to filter are dropped.

filter.args

list. Arguments passed to filter, if a function
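A filter function of the shape described above might look like the following sketch. The function name amount_filter and its max_diff argument are hypothetical, and the example assumes the merged data carries amount.x and amount.y columns (the default suffixes). It is demonstrated on a base merge result rather than on merge_plus output.

```r
# Hypothetical filter: keep matches whose amounts differ by at most max_diff.
# The first argument is the merged data.frame, as the filter argument requires.
amount_filter <- function(df, max_diff = 0) {
  df[abs(df$amount.x - df$amount.y) <= max_diff, , drop = FALSE]
}

x <- data.frame(id = 1:5, name = c("a", "b", "c", "d", "d"),
                amount = 101:105, stringsAsFactors = FALSE)
y <- data.frame(id = 6:10, name = c("b", "c", "d", "e", "f"),
                amount = rep(102, 5), stringsAsFactors = FALSE)

merged <- merge(x, y, by = "name")           # stand-in for the merged dataset
kept <- amount_filter(merged, max_diff = 1)  # drops the two "d" matches
kept
```

Under these assumptions, such a function would presumably be supplied as filter = amount_filter with filter.args = list(max_diff = 1).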

evaluate

function. Function to evaluate merge_plus output.

evaluate.args

list. Arguments passed to evaluate.

match_type

string. If 'exact', the match is exact; if 'fuzzy', the match is fuzzy.

amatch.args

list. Additional arguments for amatch, used if match_type = 'fuzzy'. Suggested defaults are provided (see amatch, method = 'jw').
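To get a feel for what the suggested defaults do, they can be passed directly to amatch. The sketch below assumes amatch refers to stringdist::amatch and that the stringdist package is installed; the example strings are made up, and how merge_plus calls amatch internally is not shown here.

```r
# Only run if the stringdist package is available (assumed source of amatch)
if (requireNamespace("stringdist", quietly = TRUE)) {
  # The defaults listed in Usage
  args <- list(method = "jw", p = 0.1, maxDist = 0.05, matchNA = FALSE)

  # For each element of x, amatch returns the index of the closest entry in
  # table within maxDist, or NA if none is close enough
  idx <- do.call(stringdist::amatch,
                 c(list(x = c("martha", "smith"),
                        table = c("marhta", "jones")),
                   args))
  print(idx)
}
```

With maxDist = 0.05 the Jaro-Winkler threshold is strict: "martha" and "marhta" are about 0.039 apart and match, while "smith" is not close to either entry. Raising maxDist loosens the match.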

Value

list with matches, filtered matches (if applicable), data1 and data2 minus matches, and match evaluation

See Also

match_evaluate

Examples

# Two small datasets sharing a 'name' column
x = data.frame(id = 1:5, name = c('a','b','c','d','d'), amount = 101:105)
y = data.frame(id = 6:10, name = c('b','c','d','e','f'), amount = rep(102, 5))
# Exact merge on name
merge_plus(x, y, by = 'name', unique_key_1 = 'id', unique_key_2 = 'id')
# Same merge, scored on the difference in amount and filtered at 0.5
merge_plus(x, y, by = 'name', unique_key_1 = 'id', unique_key_2 = 'id',
  score_settings = list(amount = list(compare_type = 'difference', weight = 1)), filter = 0.5)

                

mfriedri12/fedmatch documentation built on Aug. 4, 2017, 7:41 a.m.