tknz_sent() and preprocess() now have different implementations on Windows and UNIX OSs (the previous C++ implementation behaves unpredictably on Windows, see #30). This fix also introduces minor changes in the tknz_sent() output in some corner cases (e.g. tknz_sent("") now returns character(0), whereas it used to return "").

perplexity() gets a new argument exp that allows returning the cross-entropy per word, rather than the perplexity (its exponential).

perplexity.character() gets a new argument detailed that allows returning, alongside the total perplexity of the input document, the cross-entropies and word lengths of individual sentences. Closes #28.

[...] ?kgram_freqs.

R requirements 3.5 -> 4.0.

[...] SystemRequirements: C++11 (see this tidyverse blog post).

verbose arguments now default to FALSE.

probability(), perplexity() and sample_sentences() are restricted to accept only language_model class objects as their model argument.

as_dictionary(NULL) now returns an empty dictionary.

[...] .preprocess and .tknz_sent arguments to be ignored in process_sentences().

[...] max_lines and batch_size arguments in kgram_freqs.connection().

[...] dictionary.

[...] dictionary() with batch processing and non-trivial size constraints on vocabulary size.
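
The relationship between the exp and detailed options of perplexity() noted above can be illustrated with a minimal base R sketch. It does not call the package. The word probabilities are made-up numbers, the natural logarithm is assumed, and the word-weighted averaging of sentence cross-entropies is an assumption about how a document total could be assembled, not a statement of the package's internals.

```r
# Hypothetical per-word probabilities for two sentences
# (illustrative numbers, not the output of any fitted model).
sentence_probs <- list(
  c(0.5, 0.25, 0.125),
  c(0.5, 0.5)
)

# Per-sentence summaries, similar in spirit to a "detailed" breakdown:
# number of words and cross-entropy per word (natural log assumed).
n_words <- lengths(sentence_probs)
cross_entropy <- vapply(
  sentence_probs,
  function(p) -mean(log(p)),
  numeric(1)
)

# Document-level cross-entropy per word, taken here as the
# word-weighted mean of the sentence cross-entropies (an assumption).
doc_cross_entropy <- sum(n_words * cross_entropy) / sum(n_words)

# Perplexity is the exponential of the cross-entropy per word.
doc_perplexity <- exp(doc_cross_entropy)
```

In this sketch, doc_cross_entropy plays the role of the value one would expect with exp = FALSE and doc_perplexity the value with exp = TRUE, while n_words and cross_entropy mirror the per-sentence quantities mentioned for detailed.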