Fix encoding issue for non-ASCII characters to work with fastmatch
Add functionality
- perm_tester for Monte Carlo permutation tests for model p-values
- rancor_builder creates a random corpus based on provided term probabilities
- rancors_builder creates multiple random corpora
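As a rough illustration of the idea behind a Monte Carlo permutation test, the sketch below estimates a p-value for a difference in group means by repeatedly shuffling the pooled observations. It is written in plain Python only to show the logic; the function name `permutation_p_value` and its arguments are hypothetical, and it is not the implementation or signature of perm_tester, which targets model p-values.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means.

    Hypothetical helper for illustration; not part of the package.
    """
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_perm):
        # Shuffling breaks any link between a value and its group label.
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_perm + 1)
```

The p-value is the share of shuffled datasets whose statistic is at least as large as the observed one, so more permutations give a finer-grained estimate at the cost of more computation.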
Include additional tests, updated documentation and vignettes
Working on an encoding error in fastmatch, which shows inconsistent behavior with non-ASCII characters. This dev version provides a temporary fix.
- doc_centrality calculates four graph-based centrality metrics using DTMs
- doc_similarity calculates four document similarity measures using DTMs
- get_regions , instead of mlpack
- seq_builder creates a token-integer sequence representation
- dtm_builder includes an option to return a dense base R matrix
- dtm_stopper includes an option to remove terms based on a term's rank (e.g., top 10); stopping based on count and stopping based on proportion are now two separate options
- find_transformation() to norm, center, and align matrices
- find_projection() finds the projection matrix onto a vector
- find_rejection() finds the rejection matrix away from a vector
- dtm_melter() quickly turns a DTM into a triplet dataframe (doc_id, term, count)
- get_centroid() naming (limits to single word for names)
- dtm_stopper() to stop words by document or term frequencies
- stop_freq was changed to stop_termfreq
- dtm_resampler() to resample proportion and fixed N lengths
- NEWS.md file to track changes to the package.
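For readers unfamiliar with the projection and rejection matrices mentioned in the list above, here is a minimal sketch of the underlying linear algebra, assuming the standard definitions P = v vᵀ / (vᵀ v) and R = I − P. It uses plain Python lists rather than the package's R code, and the names `projection_matrix`, `rejection_matrix`, and `apply_matrix` are hypothetical helpers, not the package's functions.

```python
def projection_matrix(v):
    """Matrix that projects any vector onto the line spanned by v."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0:
        raise ValueError("cannot project onto the zero vector")
    return [[a * b / norm_sq for b in v] for a in v]


def rejection_matrix(v):
    """Matrix that removes the component along v (identity minus projection)."""
    p = projection_matrix(v)
    n = len(v)
    return [
        [(1.0 if i == j else 0.0) - p[i][j] for j in range(n)]
        for i in range(n)
    ]


def apply_matrix(matrix, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in matrix]


v = [1.0, 0.0]
x = [3.0, 4.0]
along = apply_matrix(projection_matrix(v), x)   # part of x along v
away = apply_matrix(rejection_matrix(v), x)     # part of x orthogonal to v
```

The two parts always sum back to the original vector, which is a quick sanity check when applying either matrix.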