| blockData | R Documentation | 
Blocks two data sets on one or more variables before conducting a merge.
blockData(dfA, dfB, varnames, window.block, window.size,
kmeans.block, nclusters, iter.max, n.cores)
| dfA | Dataset A - to be matched to Dataset B | 
| dfB | Dataset B - to be matched to Dataset A | 
| varnames | A vector of variable names to use for blocking. Must be present in both dfA and dfB. | 
| window.block | A vector of variable names to block on using window blocking. Must be present in varnames. | 
| window.size | The size of the window for window blocking. Default is 1 (observations +/- 1 on the specified variable will be blocked together). | 
| kmeans.block | A vector of variable names to block on using k-means blocking. Must be present in varnames. | 
| nclusters | Number of clusters to create with k-means. Default value is the number of clusters where the average cluster size is 100,000 observations. | 
| iter.max | Maximum number of iterations for the k-means algorithm to run. Default is 5000. | 
| n.cores | Number of cores to parallelize over. Default is NULL. | 
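The window.size and nclusters defaults described above can be illustrated with a minimal base-R sketch. The helpers window_blocks and default_nclusters are hypothetical and are not part of the package. The list element names dfA.inds and dfB.inds, and the choice of which row count feeds the cluster default, are assumptions made for illustration only.

```r
# Hypothetical helper (not part of the package): window blocking on one
# numeric variable. Each distinct value in `a` defines a block holding the
# rows of `a` equal to it and the rows of `b` within +/- window.size of it.
window_blocks <- function(a, b, window.size = 1) {
  centers <- sort(unique(a))
  lapply(centers, function(v) {
    list(dfA.inds = which(a == v),
         dfB.inds = which(abs(b - v) <= window.size))
  })
}

birthyear_A <- c(1980, 1981, 1990)
birthyear_B <- c(1979, 1982, 1991, 2000)
blocks <- window_blocks(birthyear_A, birthyear_B, window.size = 1)

# Hypothetical helper: a cluster count that gives roughly 100,000 rows per
# cluster, with a floor of one cluster. Which row count the package actually
# uses is an assumption here.
default_nclusters <- function(n_obs, target = 1e5) {
  max(round(n_obs / target), 1)
}
default_nclusters(300000)
```

With window.size = 1, a row in dataset B with birth year 1979 falls in the block centred on 1980, while a row with birth year 2000 falls in no block at all.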
A list with one entry per block. Each entry contains two vectors: one with the indices of the block members in dataset A, and one with the indices of the block members in dataset B.
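To make the shape of the return value concrete, the sketch below builds a mock object by hand and tabulates the block sizes. The names block.1, block.2, dfA.inds, and dfB.inds are assumptions for illustration; check names() on the actual output before relying on them.

```r
# Mock object with the structure described above: one entry per block,
# each holding an index vector for dataset A and one for dataset B.
# Entry and element names are assumed, not taken from the documentation.
block_out <- list(
  block.1 = list(dfA.inds = c(1L, 4L), dfB.inds = c(2L, 3L, 5L)),
  block.2 = list(dfA.inds = c(2L, 3L), dfB.inds = c(1L, 4L))
)

# One row per block, with the number of members from each dataset.
block_sizes <- t(sapply(block_out, function(b) {
  c(nA = length(b$dfA.inds), nB = length(b$dfB.inds))
}))
block_sizes
```

Tabulating block sizes this way is a quick check that no block is empty on one side before merging within blocks.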
## Not run: 
block_out <- blockData(dfA, dfB, varnames = c("city", "birthyear"))
## End(Not run)
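The example above assumes that dfA and dfB already exist. As a self-contained sketch of how the per-block indices might be used afterwards, the code below builds two toy data frames and uses a hypothetical stand-in, exact_blocks, in place of blockData so that it runs without the package. The stand-in performs exact blocking only, keeps only blocks present in both data sets, and uses assumed element names; none of this is confirmed as the package's behaviour.

```r
dfA <- data.frame(city = c("Boston", "Austin", "Boston"),
                  birthyear = c(1980, 1975, 1980),
                  stringsAsFactors = FALSE)
dfB <- data.frame(city = c("Austin", "Boston", "Denver"),
                  birthyear = c(1975, 1980, 1990),
                  stringsAsFactors = FALSE)

# Hypothetical stand-in for blockData(): exact blocking on a pasted key.
exact_blocks <- function(dfA, dfB, varnames) {
  keyA <- do.call(paste, c(dfA[varnames], sep = "|"))
  keyB <- do.call(paste, c(dfB[varnames], sep = "|"))
  shared <- intersect(unique(keyA), unique(keyB))
  out <- lapply(shared, function(k) {
    list(dfA.inds = which(keyA == k), dfB.inds = which(keyB == k))
  })
  names(out) <- paste0("block.", seq_along(out))
  out
}

block_out <- exact_blocks(dfA, dfB, c("city", "birthyear"))

# Subset each data frame to the rows in each block, ready for a
# within-block merge.
pairs <- lapply(block_out, function(b) {
  list(A = dfA[b$dfA.inds, ], B = dfB[b$dfB.inds, ])
})
```

Each element of pairs then holds the two subsets that would be compared against each other, so the merge never considers pairs from different blocks.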