new_ppm_decay: Create decay-based PPM model
In pmcharrison/ppm: Prediction by Partial Matching


View source: R/new-model.R

Creates a decay-based PPM model.

new_ppm_decay(
  alphabet_size,
  order_bound = 10L,
  ltm_weight = 1,
  ltm_half_life = 10,
  ltm_asymptote = 0,
  noise = 0,
  stm_weight = 1,
  stm_duration = 0,
  buffer_weight = 1,
  buffer_length_time = 0,
  buffer_length_items = 0L,
  only_learn_from_buffer = FALSE,
  only_predict_from_buffer = FALSE,
  seed = sample.int(.Machine$integer.max, 1),
  debug_smooth = FALSE,
  debug_decay = FALSE,
  alphabet_levels = character()
)

`alphabet_size`	(Integerish scalar) The size of the alphabet upon which the model will be trained and tested. If not provided, this will be taken as `length(alphabet_levels)`.
`order_bound`	(Integerish scalar) The model's Markov order bound.
`ltm_weight`	(Numeric scalar) w_2, initial weight in the long-term memory phase.
`ltm_half_life`	(Numeric scalar) t_2, half life of the long-term memory phase. Must be greater than zero.
`ltm_asymptote`	(Numeric scalar) w_∞, asymptotic weight as time tends to infinity.
`noise`	(Numeric scalar) σ_ε, scale parameter for the retrieval noise distribution.
`stm_weight`	(Numeric scalar) w_1, initial weight in the short-term memory phase.
`stm_duration`	(Numeric scalar) t_1, temporal duration of the short-term memory phase, in seconds.
`buffer_weight`	(Numeric scalar) w_0, weight during the buffer phase.
`buffer_length_time`	(Numeric scalar) t_b, the model's temporal buffer capacity.
`buffer_length_items`	(Integerish scalar) n_b, the model's itemwise buffer capacity.
`only_learn_from_buffer`	(Logical scalar) If TRUE, then n-grams are only learned if they fit within the memory buffer. The default value is `FALSE`.
`only_predict_from_buffer`	(Logical scalar) If TRUE, then the context used for prediction is limited by the memory buffer. Specifically, for a context to be used for prediction, the first symbol within that context must still be within the buffer at the point immediately before the predicted event occurs. The default value is `FALSE`.
`seed`	Random seed for prediction generation. By default this is linked with R's random seed, such that reproducible behaviour can be ensured as usual with the `set.seed` function.
`debug_smooth`	(Logical scalar) Whether to print (currently rather messy and ad hoc) debug output for the smoothing mechanism.
`debug_decay`	(Logical scalar) Whether to print (currently rather messy and ad hoc) debug output for the decay mechanism.
`alphabet_levels`	(Character vector) Optional vector of levels for the alphabet. If provided, these will be used to define factor levels for the output.
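
The default `seed` expression draws from R's own random number stream, which is why `set.seed` makes the model's behaviour reproducible. The base-R snippet below illustrates only that mechanism; it evaluates the same expression as the default argument and does not call the package itself.

```r
# The default value of `seed` is sample.int(.Machine$integer.max, 1).
# Evaluating that expression after set.seed() gives the same integer each time,
# so two models created after the same set.seed() call receive the same seed.
set.seed(42)
seed_a <- sample.int(.Machine$integer.max, 1)

set.seed(42)
seed_b <- sample.int(.Machine$integer.max, 1)

identical(seed_a, seed_b)
```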

Decay-based PPM models generalise the PPM algorithm to incorporate memory decay, where the effective counts of observed n-grams decrease over time to reflect processes of auditory memory.

The weight of a given n-gram over time is determined by a decay kernel. This decay kernel is parametrised by the arguments w_0, w_1, w_2, w_∞, n_b, t_b, t_1, t_2, σ_ε (see above). These parameters combine to define a decay kernel of the following form:

[Figure: schematic of the decay kernel, with its three phases colour-coded as listed below.]

The decay kernel has three phases:

Buffer (yellow);
Short-term memory (red);
Long-term memory (blue).

While within the buffer, the n-gram has weight w_0. The buffer has limited temporal and itemwise capacity. In particular, an n-gram will leave the buffer once one of two conditions is satisfied:

A set amount of time, t_b, elapses since the first symbol in the n-gram was observed, or
The buffer exceeds the number of symbols it can store, n_b, and the n-gram no longer fits completely in the buffer, having been displaced by new symbols.

There are some subtleties in how this works in practice; refer to Harrison et al. (2020) for details.
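
As a rough illustration of the two exit conditions, the base-R sketch below tests whether an n-gram is still in the buffer. The function `in_buffer` and its arguments are hypothetical and not part of the package. Whether the boundary cases are inclusive or exclusive is an assumption here, and the subtleties mentioned above are ignored.

```r
# Hypothetical helper: is an n-gram still inside the buffer?
# first_symbol_time / first_symbol_pos: onset time and sequence position of the
#   n-gram's first symbol.
# now_time / now_pos: current time and position of the most recent symbol.
in_buffer <- function(first_symbol_time, first_symbol_pos,
                      now_time, now_pos,
                      buffer_length_time, buffer_length_items) {
  # Condition 1: less than buffer_length_time has elapsed since the first symbol.
  within_time <- (now_time - first_symbol_time) < buffer_length_time
  # Condition 2: every symbol from the n-gram's first symbol up to the most
  # recent symbol still fits within buffer_length_items items.
  within_items <- (now_pos - first_symbol_pos + 1L) <= buffer_length_items
  within_time && within_items
}

in_buffer(0, 1L, now_time = 1.5, now_pos = 3L,
          buffer_length_time = 2, buffer_length_items = 15L)
in_buffer(0, 1L, now_time = 2.5, now_pos = 3L,
          buffer_length_time = 2, buffer_length_items = 15L)
```

The first call falls inside both limits; the second exceeds the temporal limit, so the n-gram would have left the buffer.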

The second phase, short-term memory, begins as soon as the buffer phase completes. It has a fixed temporal duration of t_1. At the beginning of this phase, the n-gram has weight w_1; during this phase, its weight decays exponentially until it reaches w_2 at the end of the phase.

The third phase, long-term memory, begins as soon as the short-term memory phase completes. It has an unlimited temporal duration. At the beginning of this phase, the n-gram has weight w_2; during this phase, its weight decays exponentially, with half life t_2, to an asymptote of w_∞.
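
The three phases can be sketched as a piecewise function of the time elapsed since the n-gram was observed. This is a minimal base-R illustration, not the package's implementation. The function `decay_weight` is hypothetical, the itemwise buffer limit and retrieval noise are ignored, and two assumptions are made: the short-term rate is chosen so that the weight reaches w_2 exactly when that phase ends, and the long-term half life applies to the excess of the weight over the asymptote.

```r
# Hypothetical sketch of the decay kernel as a function of elapsed time (seconds).
# Argument names mirror new_ppm_decay(); the weights are assumed to be positive.
decay_weight <- function(elapsed,
                         buffer_weight = 1, buffer_length_time = 0,
                         stm_weight = 1, stm_duration = 0,
                         ltm_weight = 1, ltm_half_life = 10,
                         ltm_asymptote = 0) {
  stopifnot(ltm_half_life > 0, all(elapsed >= 0))
  stm_start <- buffer_length_time                 # end of the buffer phase
  ltm_start <- buffer_length_time + stm_duration  # end of the short-term phase

  # Assumption: exponential rate that takes stm_weight to ltm_weight over stm_duration.
  stm_rate <- if (stm_duration > 0) log(stm_weight / ltm_weight) / stm_duration else 0
  # Assumption: the half life acts on the weight's excess over the asymptote.
  ltm_rate <- log(2) / ltm_half_life

  ifelse(
    elapsed < stm_start,
    buffer_weight,
    ifelse(
      elapsed < ltm_start,
      stm_weight * exp(-stm_rate * (elapsed - stm_start)),
      ltm_asymptote +
        (ltm_weight - ltm_asymptote) * exp(-ltm_rate * (elapsed - ltm_start))
    )
  )
}

# Defaults: a single long-term phase with a 10 s half life.
decay_weight(c(0, 10, 20))

# An illustrative three-phase kernel (parameter values chosen arbitrarily).
decay_weight(c(0, 2, 7, 17),
             buffer_weight = 1, buffer_length_time = 2,
             stm_weight = 0.8, stm_duration = 5,
             ltm_weight = 0.2, ltm_half_life = 10, ltm_asymptote = 0.1)
```

Under these assumptions the default kernel halves every 10 s, and the three-phase kernel is continuous at the boundary between the short-term and long-term phases.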

The model optionally implements Gaussian noise at the weight retrieval stage. This Gaussian is parametrised by the standard deviation parameter σ_ε. See Harrison et al. (2020) for details.
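
A rough way to picture retrieval noise is to perturb each weight with a zero-mean Gaussian draw at the moment it is read. The sketch below is an illustration only. The function `retrieve_weight` is hypothetical, and clamping the noisy weight at zero is an assumption made here so that weights stay non-negative; the cited paper gives the actual formulation.

```r
# Hypothetical sketch: add zero-mean Gaussian noise to weights at retrieval.
retrieve_weight <- function(weight, noise = 0) {
  if (noise == 0) {
    return(weight)  # noise disabled: weights are returned unchanged
  }
  noisy <- weight + stats::rnorm(length(weight), mean = 0, sd = noise)
  pmax(0, noisy)    # assumption: negative weights are clamped to zero
}

set.seed(1)
retrieve_weight(c(0.2, 0.5, 1), noise = 0.1)
```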

This function supports simpler decay functions with fewer stages; in fact, the default parameters define a one-stage decay function, corresponding to a simple exponential decay with a half life of 10 s. To enable the buffer, buffer_length_time and buffer_length_items should be made non-zero, and only_learn_from_buffer and only_predict_from_buffer should be set to TRUE. Likewise, retrieval noise is enabled by setting noise to a non-zero value, and the short-term memory phase is enabled by setting stm_duration to a non-zero value.
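
The following sketch shows how the settings described above might be combined to enable all three phases plus retrieval noise. The parameter values are arbitrary illustrations, not recommendations. The call to `model_seq` assumes that it accepts a `time` vector of event onsets alongside the sequence. The block only touches the package if it is installed.

```r
# Illustrative argument set enabling the buffer, the short-term phase, and noise.
kernel_args <- list(
  alphabet_size = 3L,
  order_bound = 4L,
  buffer_weight = 1,
  buffer_length_time = 2,        # non-zero: temporal buffer limit
  buffer_length_items = 15L,     # non-zero: itemwise buffer limit
  only_learn_from_buffer = TRUE,
  only_predict_from_buffer = TRUE,
  stm_weight = 0.8,
  stm_duration = 5,              # non-zero: enables the short-term phase
  ltm_weight = 0.2,
  ltm_half_life = 10,
  ltm_asymptote = 0.1,
  noise = 0.5                    # non-zero: enables retrieval noise
)

if (requireNamespace("ppm", quietly = TRUE)) {
  set.seed(1)  # the default seed is drawn from R's random number stream
  model <- do.call(ppm::new_ppm_decay, kernel_args)
  events <- c(1L, 2L, 3L, 1L, 2L, 3L)
  onsets <- seq(0, by = 0.5, length.out = length(events))
  # Assumption: model_seq() takes the onset times through a `time` argument.
  result <- ppm::model_seq(model, events, time = onsets)
  print(result)
}
```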

The names of the 'short-term memory' and 'long-term memory' phases should be considered arbitrary in this context; they do not necessarily correspond directly to their psychological namesakes, but are instead simply terms of convenience.

The resulting PPM-Decay model uses interpolated smoothing with escape method A, and explicitly disables exclusion and update exclusion. See Harrison et al. (2020) for details.

A PPM-decay model object. These objects have reference semantics.

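Harrison, P. M. C., Bianco, R., Chait, M., & Pearce, M. T. (2020). PPM-Decay: A computational model of auditory prediction with memory decay. PLOS Computational Biology.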

new_ppm_simple, model_seq.

pmcharrison/ppm documentation built on June 4, 2021, 9:45 a.m.