BlockRlData: Block a record linkage dataset


View source: R/blocking.R

Description

Blocks a record linkage dataset on substrings of selected variables in the dataset.
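
The idea of blocking on substrings can be sketched in base R. This is only an illustration, not the ActiveRL implementation: the `records` data frame and the `key` and `blocks` names are hypothetical, and the sketch assumes the substring is taken from the start of each value.

```r
# Hypothetical toy data; not part of the package.
records <- data.frame(
  first = c("anna", "anne", "bob", "bobby", "carl"),
  last  = c("smith", "smyth", "jones", "jones", "king"),
  stringsAsFactors = FALSE
)

# Blocking key: first 2 characters of `first` and first 1 character of `last`
# (assumption: leading characters are used).
key <- paste(substr(records$first, 1, 2),
             substr(records$last, 1, 1),
             sep = "_")

# Records that share a key fall into the same block.
blocks <- split(records, key)
names(blocks)
```

Only records inside the same block would then be compared with each other, which is what makes blocking reduce the number of comparisons.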

Usage

BlockRlData(RLdata, var.names, n.chars = NULL, unique.ids = NULL,
  pre.block.record = c(TRUE, FALSE))

Arguments

RLdata

a data frame containing the records to be matched

var.names

a vector of strings containing the variable names you want to block by

n.chars

a vector of integers giving the number of characters to compare in each variable of var.names, one entry per variable

unique.ids

a vector containing the true unique identifiers of the records in RLdata. It should be of length nrow(RLdata)

Value

A list containing the blocking information, the blocked data, and the blocked ids

BlockInfo

a list of blocking information: blocks, factors, reduction.ratio

DataSplit

a list of datasets corresponding to each block

IdSplit

a list of vectors containing the unique ids corresponding to each block
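
As a rough illustration of how per-block splits and a reduction ratio relate, the sketch below uses base R only. It assumes the common definition of the reduction ratio (one minus the share of pairwise comparisons that remain after blocking); the package may compute reduction.ratio differently, and all object names here are hypothetical.

```r
# Hypothetical block labels and true ids for six records.
block.id <- c("a", "a", "b", "b", "b", "c")
ids      <- c(1, 1, 2, 3, 2, 4)

# Row indices and ids grouped by block (analogous in spirit to
# DataSplit and IdSplit, but not the package's actual objects).
data.split <- split(seq_along(block.id), block.id)
id.split   <- split(ids, block.id)

# Pairwise comparisons without blocking and within blocks.
pairs.all     <- choose(length(block.id), 2)
pairs.blocked <- sum(choose(lengths(data.split), 2))

# Assumed definition: share of comparisons avoided by blocking.
reduction.ratio <- 1 - pairs.blocked / pairs.all
reduction.ratio
```

A value close to 1 means blocking removed most candidate pairs; the id vectors per block make it possible to check whether true matches ended up in the same block.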

Examples

BlockBySubstr(iris, "Species") #identifies 2 blocks
BlockBySubstr(iris, "Species", 2) #identifies 3 blocks
BlockBySubstr(iris, c("Species", "Sepal.Length"), c(2,1)) #identifies 3 blocks

kaylafrisoli/ActiveRL documentation built on May 20, 2019, 7:53 a.m.