Sample sentences from a language model's probability distribution.

```
sample_sentences(model, n, max_length, t = 1)
```

model
an object of class

n
an integer. Number of sentences to sample.

max_length
an integer. Maximum length of sampled sentences.

t
a positive number. Sampling temperature (optional); see Details.

This function samples sentences according to the prescribed language model's
probability distribution, with an optional temperature parameter.
The temperature transform of a probability distribution is defined by
p(t) = exp(log(p) / t) / Z(t),
where Z(t) is the partition function, fixed by the normalization condition
sum(p(t)) = 1.
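The temperature transform can be illustrated on a small discrete distribution. The following is a minimal sketch, not the package's implementation: `temperature_transform` is a hypothetical helper introduced here, and the example probabilities are made up.

```r
# Hypothetical helper (not a kgrams function): temperature transform of a
# discrete probability distribution p, following p(t) = exp(log(p) / t) / Z(t).
temperature_transform <- function(p, t = 1) {
  stopifnot(t > 0, all(p >= 0), any(p > 0))
  lp <- log(p) / t          # log(0) = -Inf, so zero entries stay at zero
  w <- exp(lp - max(lp))    # subtracting the maximum avoids underflow
  w / sum(w)                # normalization, i.e. division by Z(t)
}

p <- c(a = 0.7, b = 0.2, c = 0.1)
temperature_transform(p, t = 1)     # leaves p unchanged
temperature_transform(p, t = 100)   # close to uniform
temperature_transform(p, t = 0.01)  # concentrated on the most likely outcome
```

High temperatures flatten the distribution, while low temperatures concentrate the mass on the most probable outcome.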

Sampling is performed word by word, using the already sampled string
as context, starting from the Begin-Of-Sentence context (i.e. N - 1
BOS tokens). Sampling stops either when an End-Of-Sentence token is
encountered, or when the string exceeds max_length, in which case
a truncated output is returned.
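The word-by-word procedure can be sketched with a toy bigram model (N = 2, so the starting context is a single BOS token). This is an illustration only: `sample_one_sentence`, the `<BOS>` and `<EOS>` labels, the probability table, and the choice to measure length in words are all assumptions made here, not the package's internals.

```r
# Toy bigram continuation probabilities: rows are contexts, columns are
# candidate next words. All names and numbers are made up for illustration.
probs <- matrix(
  c(0.6, 0.3, 0.1,
    0.1, 0.6, 0.3,
    0.2, 0.1, 0.7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("<BOS>", "hello", "world"), c("hello", "world", "<EOS>"))
)

# Hypothetical helper: samples one sentence of at most max_length words.
sample_one_sentence <- function(probs, max_length) {
  context <- "<BOS>"            # N - 1 = 1 BOS token for a bigram model
  words <- character(0)
  while (length(words) < max_length) {
    nxt <- sample(colnames(probs), size = 1, prob = probs[context, ])
    if (nxt == "<EOS>") break   # End-Of-Sentence token: stop sampling
    words <- c(words, nxt)
    context <- nxt              # the sampled word becomes the new context
  }
  paste(words, collapse = " ")  # cut off at max_length if no EOS was drawn
}

set.seed(1)
sample_one_sentence(probs, max_length = 5)
```

For a higher-order model the context would hold the last N - 1 sampled words rather than only the most recent one.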

A word of caution on some special smoothers: the 'sbo' smoother (Stupid
Backoff) does not produce normalized continuation probabilities, but rather
continuation *scores*. Sampling is performed here by assuming that
Stupid Backoff scores are *proportional* to actual probabilities.
The 'ml' smoother (Maximum Likelihood) does not assign probabilities when the
k-gram count of the context is zero. When this happens, the next word is
chosen uniformly at random from the model's dictionary.
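The two special cases above can be sketched in a single step function. This is a hedged illustration, not the package's code: `sample_next_word` is a hypothetical helper, and representing undefined probabilities as `NA` or as an all-zero vector is an assumption made for the example.

```r
# Hypothetical helper: draws the next word from a vector of non-negative
# continuation scores whose names are the dictionary words.
sample_next_word <- function(scores) {
  if (anyNA(scores) || sum(scores) == 0) {
    # No probabilities available for this context: uniform choice
    return(sample(names(scores), size = 1))
  }
  # Scores need not sum to one; treat them as proportional to probabilities
  sample(names(scores), size = 1, prob = scores / sum(scores))
}

set.seed(1)
sample_next_word(c(the = 0.8, a = 0.4, an = 0.1))  # unnormalized scores
sample_next_word(c(the = 0, a = 0, an = 0))        # uniform fallback
```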

a character vector of length `n`. Random sentences generated
from the language model's distribution.

Valerio Gherardi

```
# Sample sentences from 8-gram Kneser-Ney model trained on Shakespeare's
# "Much Ado About Nothing"

### Prepare the model and set seed
freqs <- kgram_freqs(much_ado, 8, .tknz_sent = tknz_sent)
model <- language_model(freqs, "kn", D = 0.75)
set.seed(840)

sample_sentences(model, n = 3, max_length = 10)

### Sampling at high temperature
sample_sentences(model, n = 3, max_length = 10, t = 100)

### Sampling at low temperature
sample_sentences(model, n = 3, max_length = 10, t = 0.01)
```

