Man pages for rijpma/capelinker
Machine Learning-based Record Linkage for Historical South Africa

candidates          Create candidate links from two datasets
conf2tex            Print a confusion matrix as TeX
distcalc            Calculate distances between character and numeric variables
expand_index        Supplement an existing linking index with new links
gk                  Two-dimensional Gaussian kernel
initials            Create initials from a string containing names
label               Manually label data
len_longest_word    Character length of the longest word in a string
predict_links       Predict links
preflight           Check whether dataset is ready for linkage
rand_strings_like   Generate random strings resembling old strings
rm_diacretics       Remove diacritics from letters
split_prefixes      Split out prefixes from surname strings
stringdist_closest  Calculate closest string distance to another string in a...
uniformise_string   Uniformise strings
xgbm_ff             xgb.DMatrix from dataframe and formula
rijpma/capelinker documentation built on Nov. 7, 2024, 3:06 a.m.