label | R Documentation |
label
Wraps a sampling step and the candidates function to make manual labelling of training data easier.
label(
dat_from,
dat_to,
persid_from,
persid_to,
blockvariable,
blocktype,
N,
path,
...
)
dat_from |
a data.table |
dat_to |
a data.table |
persid_from |
string identifying the person id variable in dat_from |
persid_to |
string identifying the person id variable in dat_to |
blockvariable |
string identifying the blocking variable |
N |
the number of unique observations of the blocking variable to be labelled, defaults to 500 |
... |
passed to candidates for customised blocking |
label takes a random sample from dat_from, gathers candidates from dat_to, and presents them to the user, who either selects the match or indicates that there is no match.
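The sample-then-gather idea described above can be sketched roughly as follows. This is not the package's implementation: sketch_candidates and its max_dist argument are hypothetical names, plain data frames stand in for data.tables, and edit distance from utils::adist is swapped in for whatever blocking the candidates function actually performs.

```r
# Hypothetical helper, NOT part of the package: illustrates sampling unique
# values of the blocking variable and collecting nearby rows as candidates.
sketch_candidates <- function(dat_from, dat_to, blockvariable,
                              N = 500, max_dist = 2) {
  # Unique values of the blocking variable in the source data
  keys <- unique(dat_from[[blockvariable]])
  # Sample at most N of them, without replacement
  sampled <- keys[sample.int(length(keys), min(N, length(keys)))]
  # For each sampled value, keep the rows of dat_to whose blocking value lies
  # within max_dist edits (assumption: edit distance as the blocking rule)
  lapply(stats::setNames(sampled, sampled), function(key) {
    d <- utils::adist(key, dat_to[[blockvariable]])[1, ]
    dat_to[d <= max_dist, , drop = FALSE]
  })
}

# Toy data modelled on the example in this help page
d_from <- data.frame(mlast = c("jong", "smid"), persid = 1:2)
d_to <- data.frame(mlast = c("jongh", "jong", "smit"), persid = 1:3)
cand <- sketch_candidates(d_from, d_to, "mlast", N = 2, max_dist = 1)
```

Each element of the returned list holds the candidate rows for one sampled value, which is the kind of set a labelling session would then show to the user.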
The labelling session is interactive. For each sampled observation, the user is presented with a choice between:
Persid: One of the numbers of persid_to
None
At some point a "Back" option might be added.
After selecting, there is an annotation step with the following options:
Cancel
Sure
Maybe
Doubtful
Ambiguous
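A single step of such a two-stage prompt could look roughly like the sketch below. It is only an illustration under stated assumptions: ask_label and its choose argument are hypothetical, the prompt titles are invented, and treating "Cancel" as discarding the selection is a guess about its meaning. The default chooser is utils::menu, which returns the index of the selected item, or 0 if the user exits.

```r
# Hypothetical helper, NOT the package's code: one labelling step with a
# match prompt followed by an annotation prompt.
ask_label <- function(persids, choose = utils::menu) {
  # First prompt: one entry per candidate id, plus "None"
  match_choices <- c(as.character(persids), "None")
  picked <- choose(match_choices, title = "Which candidate is the match?")
  if (picked == 0) return(NULL)  # menu() returns 0 when the user exits
  # Second prompt: annotate the decision
  annotations <- c("Cancel", "Sure", "Maybe", "Doubtful", "Ambiguous")
  noted <- choose(annotations, title = "How certain is this choice?")
  # Assumption: "Cancel" discards the selection just made
  if (noted == 0 || annotations[noted] == "Cancel") return(NULL)
  list(match = match_choices[picked], annotation = annotations[noted])
}
```

Passing the chooser as an argument is a design choice for this sketch only: it lets the prompt logic be exercised in a non-interactive session by supplying a function that returns fixed indices.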
A list containing candidate pairs to be labelled
d1 = data.table::data.table(mlast = c("jong", "smid"), mfirst = c("Jan", "Jan"), wfirst = NA, wlast = NA, settlerchildren = NA, persid = c(1:2))
d2 = data.table::data.table(mlast = c("jongh", "jong", "smit"), mfirst = c("Jan", "Dirk", "Johan"), wlast = NA, wfirst = NA, settlerchildren = NA, persid = c(1:3))
label(d1, d2, "persid", "persid", "mlast", "bigram distance", 2)