new_ppm_simple: Create simple PPM model


View source: R/new-model.R

Description

Creates a simple PPM model, that is, a PPM model without any non-traditional features such as memory decay.

Usage

new_ppm_simple(
  alphabet_size,
  order_bound = 10L,
  shortest_deterministic = TRUE,
  exclusion = TRUE,
  update_exclusion = TRUE,
  escape = "c",
  debug_smooth = FALSE,
  alphabet_levels = character()
)
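
As a minimal sketch of how the constructor might be used, the example below builds a model over a five-symbol alphabet and passes an integer-coded sequence to model_seq (listed under See Also). The call form model_seq(model, seq) with default arguments, and the assumption that it returns one row per input symbol, are not stated on this page and should be checked against the model_seq documentation.

```r
library(ppm)

# Integer-coded sequence over a 5-symbol alphabet (a=1, b=2, c=3, d=4, r=5).
symbols  <- c("a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a")
alphabet <- c("a", "b", "c", "d", "r")
seq_int  <- match(symbols, alphabet)

# All other arguments keep the defaults shown in Usage.
mod <- new_ppm_simple(alphabet_size = length(alphabet), order_bound = 3L)

# Assumed call form: model, then the sequence to model.
res <- model_seq(mod, seq_int)
print(res)
```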

Arguments

alphabet_size

(Integerish scalar) The size of the alphabet upon which the model will be trained and tested. If not provided, this will be taken as length(alphabet_levels).

order_bound

(Integerish scalar) The model's Markov order bound. For example, an order bound of two means that the model bases each prediction on at most the two preceding symbols.
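
To illustrate what the order bound restricts, the base-R helper below returns the symbols available as context at a given position. The name context_for is hypothetical and not part of the package. It shows only the windowing and does not reproduce how the model combines predictions from shorter contexts.

```r
# Hypothetical helper: the longest context available at position `pos`
# under a given order bound (fewer symbols near the start of the sequence).
context_for <- function(seq, pos, order_bound) {
  if (pos <= 1L) return(seq[0])          # nothing precedes the first symbol
  start <- max(1L, pos - order_bound)    # look back at most `order_bound` symbols
  seq[start:(pos - 1L)]
}

x <- c("a", "b", "r", "a", "c")
context_for(x, 5, 2)  # "r" "a"
context_for(x, 2, 2)  # "a"
context_for(x, 1, 2)  # character(0)
```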

shortest_deterministic

(Logical scalar) If TRUE, the model will 'select' the shortest available order that provides a deterministic prediction, if such an order exists, otherwise defaulting to the longest available order. For a given prediction, if this rule results in a lower model order than would have otherwise been selected, then full counts (not update-excluded counts) will be used for the highest model order (but not for lower model orders). This behaviour matches the implementations of PPM* in Pearce (2005) and Bunton (1996).

exclusion

(Logical scalar) If TRUE, implements exclusion as defined in Pearce (2005) and Bunton (1996).

update_exclusion

(Logical scalar) If TRUE, implements update exclusion as defined in Pearce (2005) and Bunton (1996).

escape

(Character scalar) One of 'a', 'b', 'c', 'd', or 'ax', corresponding to the escape methods of the same names in Pearce (2005). The definition of escape method 'ax' in Pearce (2005) contains a mistake: the denominator of lambda needs to have 1 added. This function implements the corrected version, which is also what Pearce's LISP implementation does.
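
For orientation, escape method C (the default here) is commonly defined as follows: in a context where n symbols have been observed in total, of which t are distinct, the escape probability is t / (n + t) and a symbol observed c times receives c / (n + t). The base-R sketch below computes these quantities for a single context. escape_method_c is a hypothetical helper, and it ignores exclusion, update exclusion, and blending across orders, so its output will not in general match the package's predictions.

```r
# Hypothetical helper: textbook escape method C for one context.
# `counts` is a named vector of symbol counts observed in that context.
escape_method_c <- function(counts) {
  n_total    <- sum(counts)       # total observations in this context
  n_distinct <- sum(counts > 0)   # number of distinct symbols observed
  denom <- n_total + n_distinct
  list(
    symbol = counts / denom,      # probability mass for each symbol
    escape = n_distinct / denom   # mass passed down to the next lower order
  )
}

p <- escape_method_c(c(a = 3, b = 1, c = 0))
p$symbol   # a = 3/6, b = 1/6, c = 0
p$escape   # 2/6
```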

debug_smooth

(Logical scalar) Whether to print (currently rather messy and ad hoc) debug output for smoothing.

alphabet_levels

(Character vector) Optional vector of levels for the alphabet. If provided, these will be used to define factor levels for the output.

Value

A PPM model object. These objects have reference semantics.
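
Reference semantics mean that assigning a model to a second name does not copy it, so training through one name should be visible through the other. The sketch below illustrates this under two assumptions that are not documented on this page: that model_seq(model, seq) trains the model in place by default, and that its result has an information_content column.

```r
library(ppm)

mod   <- new_ppm_simple(alphabet_size = 3)
alias <- mod   # no copy is made; both names refer to the same model

seq_int <- c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L)

first  <- model_seq(mod, seq_int)    # assumed to train the model as it predicts
second <- model_seq(alias, seq_int)  # sees whatever `mod` has already learned

# If training happened in place, the second pass should usually be
# less surprising than the first; compare the two totals.
sum(first$information_content)
sum(second$information_content)
```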

Note

The implementation does not scale well to very large order bounds (> 50).

References

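Bunton, S. (1996). On-line stochastic processes in data compression (Doctoral dissertation). University of Washington, Seattle, WA.

Pearce, M. T. (2005). The construction and evaluation of statistical models of melodic structure in music perception and composition (Doctoral dissertation). City University, London, UK.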

See Also

new_ppm_decay, model_seq.


pmcharrison/ppm documentation built on June 4, 2021, 9:45 a.m.