Nothing
FIXED
Removed useHash = TRUE from sample.int() inside should_approx() to match changes made in R 4.2.0. sample.int() now decides for itself whether to useHash or not. Closes https://github.com/DesiQuintans/siftr/issues/16.
ADDED
Added exclude column to factor files.
CHANGED
ordered column, to conform to what real-world users would do.
ADDED
.dist is used with an orderless search (because .dist is ignored in those cases).
FIXED
NAs give a correct peek of their non-NA contents, rather than being reported as NA entirely.
CHANGED
.dist in the case of no matches is now hidden for orderless searches.
Renamed to siftr to avoid a name collision with the existing sift package on CRAN that I somehow missed.
Initial CRAN submission.
ADDED
options_sift() gets a new option: sift_peeklength. This controls the approximate length of the rand_unique entries in the data dictionary, i.e. a list of unique values in each column. This "full peek" is used as part of the "haystack" that actually gets searched by sift(). It defaults to 3000 characters, but the final length increases when separators are added. Previously, a length limit of only 500 characters was hard-coded. 3000 characters is about the length of a 1-page Word document at default settings.
FIXED
class() (e.g. "labelled" and "integer") would create a dictionary with two entries per dataframe column.
has_class() can deal with multi-classed variables now.
CHANGED
some_uniques() has short-circuit routes for datatypes that don't need the full "random sampling to get a list of its unique values" treatment. So far this is: Factors, Logicals, and Numerics.
| from comma , because some data may use commas within string values.
save_dictionary() now generates factor files for each unique factor in the dataframe, according to the tsv2label spec.
mtcars_lab now has a list column added.
should_approx() now uses sample.int() with the useHash argument, which performed better than sample().
ADDED
save_dictionary() allows you to save a data dictionary in a form that my other package, tsv2label, will accept. Closes #11.
options_sift() prints the status of all options when invoked with no arguments. Closes #12.
FIXED
n = sift_limit results.