
View source: R/sample_sentences.R

Sample sentences from a language model's probability distribution.

```
sample_sentences(model, n, max_length, t = 1)
```

| Argument | Description |
|---|---|
| `model` | an object of class |
| `n` | an integer. Number of sentences to sample. |
| `max_length` | an integer. Maximum length of sampled sentences. |
| `t` | a positive number. Sampling temperature (optional); see Details. |

This function samples sentences according to the prescribed language model's
probability distribution, with an optional temperature parameter.
The temperature transform of a probability distribution is defined by
`p(t) = exp(log(p) / t) / Z(t)`, where `Z(t)` is the partition
function, fixed by the normalization condition `sum(p(t)) = 1`.
At `t = 1` the distribution is unchanged; higher temperatures flatten it
toward uniform, while lower temperatures concentrate it on the most
probable words.
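As a minimal illustration of the temperature transform, the following base R sketch applies it to a toy probability vector. The helper name `temper` and the example vector are assumptions made for illustration; they are not part of the package.

```r
# Hypothetical helper (not part of the package): temperature transform of a
# probability vector p, i.e. p(t) = exp(log(p) / t) / Z(t).
temper <- function(p, t = 1) {
  stopifnot(t > 0, all(p >= 0))
  w <- exp(log(p) / t)  # unnormalized weights, equal to p^(1 / t)
  w / sum(w)            # divide by the partition function Z(t)
}

p <- c(a = 0.7, b = 0.2, c = 0.1)  # toy distribution (assumed)
temper(p, t = 1)     # leaves p unchanged
temper(p, t = 100)   # close to uniform
temper(p, t = 0.01)  # nearly all mass on "a"
```

For very small `t` the weights `p^(1 / t)` can underflow to zero in floating point, so a more careful version would probably subtract `max(log(p) / t)` before exponentiating.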

Sampling is performed word by word, using the already sampled string
as context, starting from the Begin-Of-Sentence context (i.e. `N - 1`
BOS tokens). Sampling stops either when an End-Of-Sentence token is
encountered, or when the string exceeds `max_length`, in which case
a truncated output is returned.
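The word-by-word procedure can be sketched in base R with a toy bigram model (`N = 2`, so the starting context is a single BOS token). The transition matrix, the token spellings `<BOS>` and `<EOS>`, and the helper `sample_one` are all assumptions made for illustration. In particular, the way truncation is handled here (simply stopping once `max_length` words have been drawn) is a guess and may differ from what the package returns.

```r
# Toy bigram continuation probabilities (assumed for illustration).
# Rows are contexts, columns are candidate next words; each row sums to 1.
words <- c("the", "cat", "sleeps", "<EOS>")
probs <- matrix(
  c(0.6, 0.1, 0.1, 0.2,
    0.0, 0.8, 0.1, 0.1,
    0.1, 0.0, 0.6, 0.3,
    0.1, 0.1, 0.0, 0.8),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("<BOS>", "the", "cat", "sleeps"), words)
)

# Hypothetical helper: sample one sentence of at most max_length words.
sample_one <- function(probs, max_length) {
  context <- "<BOS>"           # N - 1 = 1 BOS token for a bigram model
  sentence <- character(0)
  while (length(sentence) < max_length) {
    word <- sample(colnames(probs), size = 1, prob = probs[context, ])
    if (word == "<EOS>") break # End-Of-Sentence token: stop sampling
    sentence <- c(sentence, word)
    context <- word            # the last sampled word becomes the context
  }
  paste(sentence, collapse = " ")
}

set.seed(1)
replicate(3, sample_one(probs, max_length = 10))
```

For a higher-order model the context would hold the last `N - 1` sampled words rather than a single one.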

A word of caution on some special smoothers. The 'sbo' smoother (Stupid
Backoff) does not produce normalized continuation probabilities, but rather
continuation *scores*; sampling is performed here by assuming that
Stupid Backoff scores are *proportional* to actual probabilities.
The 'ml' smoother (Maximum Likelihood) does not assign probabilities when the
k-gram count of the context is zero; when this happens, the next word is
chosen uniformly at random from the model's dictionary.
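The two special cases above can be sketched as a single next-word draw in base R. The helper `sample_next` and its toy inputs are hypothetical and only mirror the behavior described in the text; this is not the package's implementation.

```r
# Hypothetical helper: draw the next word from non-negative scores,
# given as a named numeric vector over the dictionary.
sample_next <- function(scores) {
  stopifnot(all(scores >= 0))
  if (sum(scores) == 0) {
    # No probability assigned for this context (the 'ml' case described
    # above): choose uniformly at random from the dictionary.
    return(sample(names(scores), size = 1))
  }
  # Scores are treated as proportional to probabilities (the 'sbo' case),
  # so normalizing them is enough to sample.
  sample(names(scores), size = 1, prob = scores / sum(scores))
}

set.seed(1)
sample_next(c(a = 0.8, b = 0.4, c = 0.4))  # unnormalized scores
sample_next(c(a = 0, b = 0, c = 0))        # uniform fallback
```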

a character vector of length `n`. Random sentences generated
from the language model's distribution.

Valerio Gherardi

```
# Sample sentences from an 8-gram Kneser-Ney model trained on Shakespeare's
# "Much Ado About Nothing"
library(kgrams)

### Prepare the model and set seed
freqs <- kgram_freqs(much_ado, 8, .tknz_sent = tknz_sent)
model <- language_model(freqs, "kn", D = 0.75)
set.seed(840)

### Sampling at the default temperature (t = 1)
sample_sentences(model, n = 3, max_length = 10)

### Sampling at high temperature
sample_sentences(model, n = 3, max_length = 10, t = 100)

### Sampling at low temperature
sample_sentences(model, n = 3, max_length = 10, t = 0.01)
```
