Classify record pairs with unsupervised clustering methods.
Object of type
The classification method to use. One of
Further arguments for the classification method
A clustering algorithm is applied to find clusters in the comparison patterns. In the case of two clusters (the default), the cluster further from the origin (i.e. representing higher similarity values) is interpreted as the set of links, the other as the set of non-links.
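The labelling rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the comparison patterns are synthetic, and the names `patterns`, `link_cluster` and `prediction` are made up for the example. Only `kmeans` from the stats package is used.

```r
# Synthetic comparison patterns (assumption: made-up data for illustration).
# Each row is a record pair, each column an attribute similarity in [0, 1].
set.seed(1)
non_links <- matrix(runif(200 * 3, min = 0.0, max = 0.3), ncol = 3)
links     <- matrix(runif(20 * 3,  min = 0.8, max = 1.0), ncol = 3)
patterns  <- rbind(non_links, links)

# Two clusters, as in the default case.
fit <- kmeans(patterns, centers = 2, nstart = 10)

# The cluster whose center lies farther from the origin is read as links.
center_dist  <- sqrt(rowSums(fit$centers^2))
link_cluster <- which.max(center_dist)

# "L" = link, "N" = non-link
prediction <- factor(ifelse(fit$cluster == link_cluster, "L", "N"),
                     levels = c("N", "L"))
table(prediction)
```

Distance from the origin works as a criterion here because every column is a similarity value, so a larger norm of the cluster center means higher agreement across attributes.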
Supported methods are:
K-means clustering, see
Bagged clustering, see
An object of class "RecLinkResult" that represents a copy of newdata with the element rpairs$prediction, which stores the classification result, added.
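A possible way to inspect the returned object is sketched below. Several details are assumptions rather than taken from this page: the call form `classifyUnsup(rpairs, method = "kmeans")`, the example data set `RLdata500` with `identity.RLdata500`, and the blocking fields passed to `compare.dedup`. The code is guarded so that it does nothing when the RecordLinkage package is not installed.

```r
# Hedged usage sketch; argument names and values are assumptions.
if (requireNamespace("RecordLinkage", quietly = TRUE)) {
  library(RecordLinkage)

  # Example data shipped with the package (assumed here for illustration).
  data(RLdata500)

  # Build comparison patterns for deduplication; the blocking fields are
  # an arbitrary choice for this sketch.
  rpairs <- compare.dedup(RLdata500,
                          identity = identity.RLdata500,
                          blockfld = list(1, 3, 5, 6, 7))

  # Unsupervised classification with k-means clustering.
  set.seed(1)
  result <- classifyUnsup(rpairs, method = "kmeans")

  # The prediction element holds one classification per record pair.
  print(class(result))
  print(table(result$prediction))

  # Error measures and the classification table.
  summary(result)
}
```

Because the clustering starts from random centers, results can vary between runs unless a seed is set.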
Andreas Borg, Murat Sariyar
Committee Member: 1(1) 2(1) 3(1) 4(1) 5(1) 6(1) 7(1) 8(1) 9(1) 10(1)
Computing Hierarchical Clustering

Deduplication Data Set

500 records
18643 record pairs

50 matches
18593 non-matches
0 pairs with unknown status

82 links detected
0 possible links detected
18561 non-links detected

alpha error: 0.340000
beta error: 0.002635
accuracy: 0.996460

Classification table:

           classification
true status     N     P     L
      FALSE 18544     0    49
      TRUE     17     0    33