Duplicates used to be a base R reference class (defined by setRefClass()); it is now an R6 class (defined by R6::R6Class()). This is a precondition for using roxygen2 to document R6 classes.
[...] Duplicates class has been dropped: The matrix may be bulky and is used only temporarily. There is no need to store it beyond the duplicate detection workflow, which saves memory.
whatToCompare and similarityMatrix dropped from Duplicates class to improve memory efficiency.
Duplicates$detectDuplicates() is Duplicates$detect() now.
Duplicates$makeAnnotation() renamed as Duplicates$annotate().
[...] corpus when initializing Duplicates class.
Duplicates$detectDuplicates() now has arguments n (passed into polmineR::ngrams) and character_selection. Values were hard-coded previously. Default values are aligned with Kliche et al. 2014.
CQI$struc2cpos() has been replaced by RcppCWB functionality (get_region_matrix()) #5.
nchars() is implemented for subcorpus and subcorpus_bundle objects now and will be available for plpr_subcorpus by inheritance #3.
Duplicates class using a duplicates() method has been removed #7.
Duplicates$detectDuplicates() works for n = 0 (same-day comparisons only).
partition_bundle replaces partitionBundle in code and documentation (#9).
store() method removed - storing mallet objects is intended usage, so code is captured in an issue of biglda package.
papply() is removed: pbapply::pblapply() is the consolidated state-of-the-art.
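
The move from setRefClass() to R6::R6Class() noted above can be illustrated with a minimal sketch. The class names CounterRC and CounterR6, the field n and the method add() are hypothetical and chosen for illustration only; they are not part of the Duplicates class. The roxygen2 tags in the R6 variant show the kind of inline documentation that the R6 definition makes possible, assuming the R6 package is installed.

```r
library(methods)

# Hypothetical toy class, NOT the actual Duplicates implementation.
# Variant 1: base R reference class, defined by setRefClass().
CounterRC <- setRefClass(
  "CounterRC",
  fields = list(n = "numeric"),
  methods = list(
    add = function(by = 1) {
      # Fields are modified with the <<- operator in reference classes.
      n <<- n + by
      invisible(.self)
    }
  )
)

# Variant 2: R6 class, defined by R6::R6Class().
# The #' lines are roxygen2 comments; they have no effect at run time.

#' @title Hypothetical counter class
#' @description Toy example of an R6 class with inline documentation.
CounterR6 <- R6::R6Class(
  "CounterR6",
  public = list(
    #' @field n Current value of the counter.
    n = 0,
    #' @description Increase the counter.
    #' @param by Increment to add.
    add = function(by = 1) {
      # Fields are accessed via self in R6 classes.
      self$n <- self$n + by
      invisible(self)
    }
  )
)

# Both variants are instantiated with $new() and have reference semantics.
rc <- CounterRC$new(n = 0)
rc$add(2)

r6 <- CounterR6$new()
r6$add(2)
```

Calling code looks much the same in both variants, since both use $new() and $method() syntax; the difference lies mainly in how the class is defined and documented.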