compute_ppm_analyses: Compute PPM analyses


View source: R/2-compute-ppm-analyses.R

Description

This function models discrete viewpoints using the Prediction by Partial Match (PPM) algorithm.

Usage

compute_ppm_analyses(
  parent_dir,
  viewpoint_dir = file.path(parent_dir, "0-viewpoints"),
  output_dir = file.path(parent_dir, "1-ppm"),
  stm_opt = stm_options(),
  ltm_opt = ltm_options(),
  seq_test_folds = list(readRDS(file.path(viewpoint_dir, "about.rds"))$seq_test),
  seq_pretrain = readRDS(file.path(viewpoint_dir, "about.rds"))$seq_pretrain,
  viewpoints = readRDS(file.path(viewpoint_dir, "about.rds"))$discrete_viewpoints
)
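The last three defaults are all read from the about.rds file in viewpoint_dir. The sketch below mimics that lookup using a mock about.rds written to a temporary directory. The slot names come from the usage above, but the file contents (indices and viewpoint entries) are invented for illustration, and the real file written by compute_viewpoints may hold further slots and a different viewpoint format.

```r
# Mock viewpoint directory; in real use this is created by compute_viewpoints.
parent_dir <- tempfile("hvr-demo-")
viewpoint_dir <- file.path(parent_dir, "0-viewpoints")
dir.create(viewpoint_dir, recursive = TRUE)

# Hypothetical contents: a corpus of 10 sequences, 3 of them held out for testing.
mock_about <- list(
  seq_test = c(2L, 5L, 9L),
  seq_pretrain = c(1L, 3L, 4L, 6L, 7L, 8L, 10L),
  discrete_viewpoints = list("viewpoint_a", "viewpoint_b")  # placeholder entries
)
saveRDS(mock_about, file.path(viewpoint_dir, "about.rds"))

# These three expressions mirror the default argument values shown in the usage.
about <- readRDS(file.path(viewpoint_dir, "about.rds"))
seq_test_folds <- list(about$seq_test)   # a single fold containing seq_test
seq_pretrain <- about$seq_pretrain
viewpoints <- about$discrete_viewpoints
```

Because the defaults depend on viewpoint_dir, pointing viewpoint_dir at a different directory changes all three values together unless they are supplied explicitly.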

Arguments

parent_dir

(Character scalar) The parent directory for the output files, shared with functions such as compute_viewpoints and compute_model_matrix. Ignored if all other directory arguments are manually specified.

viewpoint_dir

(Character scalar) The directory for the already-generated output files from compute_viewpoints. The default should be correct if the user used the default dir argument in compute_viewpoints.

output_dir

(Character scalar) The output directory for the PPM analyses. Will be created if it doesn't exist already.

stm_opt

Options list for the short-term PPM models, as created by the function stm_options.

ltm_opt

Options list for the long-term PPM models, as created by the function ltm_options.

seq_test_folds

List of cross-validation folds for the test sequences. Each fold is represented as an integer vector, with the integers indexing the sequences within the corpus (see compute_viewpoints). The algorithm iterates over each fold, predicting the sequences within that fold, and training the model using the combination of a) the sequences from the other folds in seq_test_folds and b) the sequences identified in seq_pretrain. By default, there is just one fold corresponding to the seq_test argument of compute_viewpoints.
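As a rough illustration of the fold structure described above, the following base R sketch builds a list of folds and spells out which sequences would be predicted and which would be used for training in each iteration. The corpus indices, the number of folds, and the round-robin assignment are all assumptions made for the example; this is not the package's internal code.

```r
# Hypothetical inputs: indices into a 12-sequence corpus.
seq_test <- 1:8        # sequences to be predicted
seq_pretrain <- 9:12   # sequences used only for training
k <- 4                 # number of folds (an arbitrary choice)

# Assign the test sequences to folds in round-robin order.
fold_id <- rep_len(seq_len(k), length(seq_test))
seq_test_folds <- unname(split(seq_test, fold_id))

# For each fold: predict its sequences, train on the other folds plus seq_pretrain.
plan <- lapply(seq_along(seq_test_folds), function(i) {
  list(
    predict = seq_test_folds[[i]],
    train = sort(c(unlist(seq_test_folds[-i]), seq_pretrain))
  )
})
```

A list built this way could be supplied as the seq_test_folds argument. Assigning folds at random (for example with sample) would be an equally reasonable choice.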

seq_pretrain

(Integer vector) Sequences used to pretrain the model (in addition to any cross-validation training specified by seq_test_folds), specified as integer indices of the corpus. Defaults to the complement of seq_test as specified in compute_viewpoints.
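The default described above amounts to a set complement over the corpus indices. A minimal sketch, assuming a hypothetical corpus of 10 sequences:

```r
n_seq <- 10L                  # hypothetical corpus size
seq_test <- c(2L, 5L, 9L)     # hypothetical test sequences

# Every corpus index that is not a test sequence is used for pretraining.
seq_pretrain <- setdiff(seq_len(n_seq), seq_test)
```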

viewpoints

List of discrete viewpoints to analyse, in the format produced by the $discrete_viewpoints slot of the about.rds file produced by compute_viewpoints. Defaults to the full set of discrete viewpoints as specified in compute_viewpoints.

Details

compute_viewpoints should be run first. By default, only sequences in seq_test (see compute_viewpoints) are modelled using PPM. The default PPM implementation corresponds to that described in Pearce (2005).

Value

The primary output is written to disk in the output_dir directory. The output matrices provide raw probabilities for each event in the chord alphabet.

References

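Pearce, M. T. (2005). The construction and evaluation of statistical models of melodic structure in music perception and composition (PhD thesis). City University, London.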
pmcharrison/hvr documentation built on April 14, 2020, 2:47 a.m.