epiClassify | R Documentation |
Classifies record pairs as link, non-link or possible link based on weights computed by epiWeights and the thresholds passed as arguments.
epiClassify(rpairs, threshold.upper, threshold.lower = threshold.upper, ...)

## S4 method for signature 'RecLinkData'
epiClassify(rpairs, threshold.upper, threshold.lower = threshold.upper)

## S4 method for signature 'RLBigData'
epiClassify(rpairs, threshold.upper, threshold.lower = threshold.upper,
  e = 0.01, f = getFrequencies(rpairs),
  withProgressBar = (sink.number()==0))
rpairs |
A "RecLinkData" or "RLBigData" object. Record pairs to be classified. |
threshold.upper |
A numeric value between 0 and 1. |
threshold.lower |
A numeric value between 0 and 1, lower than or equal to threshold.upper. |
e |
Numeric vector. Estimated error rate(s). |
f |
Numeric vector. Average frequency of attribute values. |
withProgressBar |
Logical. Whether to display a progress bar. |
... |
Placeholder for optional arguments |
All record pairs with weights greater than or equal to threshold.upper are classified as links. Record pairs with weights smaller than threshold.upper and greater than or equal to threshold.lower are classified as possible links. All remaining record pairs are classified as non-links.
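The threshold rule described above can be sketched in base R. This is only an illustration, not the package's implementation: classify_by_threshold is a hypothetical helper, and the labels "N", "P" and "L" (non-link, possible link, link) are an assumption about how the result is encoded.

```r
# Hypothetical helper (not part of the package): applies the threshold rule
# to a plain numeric vector of weights.
classify_by_threshold <- function(weights, threshold.upper,
                                  threshold.lower = threshold.upper) {
  stopifnot(threshold.lower <= threshold.upper)
  # Start with everything as a non-link, then promote by threshold.
  result <- rep("N", length(weights))
  result[weights >= threshold.lower] <- "P"  # possible links
  result[weights >= threshold.upper] <- "L"  # links
  # Assumed label set: N = non-link, P = possible link, L = link
  factor(result, levels = c("N", "P", "L"))
}

w <- c(0.2, 0.5, 0.6, 0.9)
print(classify_by_threshold(w, threshold.upper = 0.6, threshold.lower = 0.5))
```

When threshold.lower is left at its default, it equals threshold.upper and no pair can fall into the "possible link" class.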
For the "RecLinkData" method, weights must have been calculated for rpairs using epiWeights.
A progress bar is displayed by the "RLBigData" method only if weights are calculated on the fly and, by default, only if output is not diverted by sink (e.g. in a Sweave script).
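The default withProgressBar = (sink.number()==0) relies on base R's sink.number(), which counts the active output diversions. The following sketch uses base R only and illustrates how that expression evaluates while output is diverted; the variable names are illustrative.

```r
# Number of active output diversions before we start
# (0 in an ordinary interactive session).
before <- sink.number()

# Temporarily divert output to a file.
tmp <- tempfile()
sink(tmp)
during <- sink.number()
bar_while_diverted <- (sink.number() == 0)  # the default expression
sink()       # restore normal output
unlink(tmp)

after <- sink.number()
print(c(before = before, during = during, after = after))
print(bar_while_diverted)
```

While the diversion is active, the default expression is FALSE, so the progress bar would be suppressed unless withProgressBar is set explicitly.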
For the "RecLinkData" method, an S3 object of class "RecLinkResult" that represents a copy of rpairs with an additional element prediction, which stores the classification result.
For the "RLBigData" method, an S4 object of class "RLResult".
Andreas Borg, Murat Sariyar
epiWeights
library(RecordLinkage)

# generate record pairs
data(RLdata500)
p <- compare.dedup(RLdata500, strcmp = TRUE, strcmpfun = levenshteinSim,
  identity = identity.RLdata500, blockfld = list("by", "bm", "bd"))

# calculate weights
p <- epiWeights(p)

# classify and show results
summary(epiClassify(p, 0.6))