View source: R/3-model-matrix.R
Computes the model matrix, which combines expectedness values from the PPM analyses with polynomial expansions of the continuous features.
compute_model_matrix(
  parent_dir,
  max_sample = Inf,
  sample_seed = 1,
  poly_degree = 4L,
  na_val = 0,
  filter_corpus = NULL,
  ltm = TRUE,
  viewpoint_dir = file.path(parent_dir, "0-viewpoints"),
  ppm_dir = file.path(parent_dir, "1-ppm"),
  output_dir = file.path(parent_dir, "2-model-matrix"),
  viewpoints = read_viewpoints(viewpoint_dir),
  seq_test = list_seq_test(ppm_dir),
  allow_repeats = FALSE
)
parent_dir
  (Character scalar) The parent directory for the output files, shared with functions such as

max_sample
  (Numeric scalar) Maximum number of events to sample for the model matrix; defaults to Inf.

sample_seed
  (Integer scalar) Random seed to make the downsampling reproducible.

poly_degree
  (Integer scalar) Degree of the polynomials to compute for the continuous features.

na_val
  (Numeric scalar) Value used to code NA in the model matrix. The statistical analyses are mostly unaffected by this value.

filter_corpus
  (NULL or a function) An optional function applied to the corpus to determine which events should be retained in the model matrix. The function is applied to the corpus object saved as

ltm
  (Logical scalar, default = TRUE)

viewpoint_dir
  (Character scalar) The directory for the already-generated output files from

ppm_dir
  (Character scalar) The directory for the already-generated output files from

output_dir
  (Character scalar) The output directory for the model matrix. Will be created if it doesn't exist already.

viewpoints
  Character vector listing the viewpoints to be included in the model matrix. By default this list is read from

seq_test
  Integer vector identifying which sequences should be sampled from for constructing the model matrix, indexing into the

allow_repeats
  (Logical scalar) Whether repeated chords are theoretically permitted in the chord sequences. It is recommended to remove such repetitions before modelling.
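How max_sample and sample_seed interact can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper downsample_events and the use of set.seed() with sample.int() are assumptions made purely for illustration.

```r
# Hypothetical helper (not part of the package): keep at most `max_sample`
# event indices, chosen reproducibly from `n_events` candidates.
downsample_events <- function(n_events, max_sample = Inf, sample_seed = 1L) {
  if (n_events <= max_sample) {
    return(seq_len(n_events))  # nothing to drop
  }
  set.seed(sample_seed)        # same seed -> same subset
  sort(sample.int(n_events, size = max_sample))
}

# With the default max_sample = Inf, every event is kept.
all_events <- downsample_events(10)

# Two runs with the same seed select the same 50 events.
subset_a <- downsample_events(1000, max_sample = 50, sample_seed = 1L)
subset_b <- downsample_events(1000, max_sample = 50, sample_seed = 1L)
```

Fixing the seed this way means that repeated runs over the same inputs would yield the same downsampled events, which is the property the sample_seed argument is documented to provide.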
The following routines should have been run already:
compute_viewpoints
compute_ppm_analyses
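Before calling compute_model_matrix, it may help to confirm that the outputs of these earlier steps are present. The sketch below rebuilds the default sub-directories shown in the usage and reports any that are missing. The parent directory location and the helper missing_inputs are hypothetical and not part of the package.

```r
# Assumed location of the analysis; replace with your own parent directory.
parent_dir <- file.path(tempdir(), "analysis")

# Default sub-directories, as given in the usage signature.
viewpoint_dir <- file.path(parent_dir, "0-viewpoints")
ppm_dir       <- file.path(parent_dir, "1-ppm")
output_dir    <- file.path(parent_dir, "2-model-matrix")

# Hypothetical check (not part of the package): which prerequisite
# directories are still missing?
missing_inputs <- function(dirs) dirs[!dir.exists(dirs)]

missing <- missing_inputs(c(viewpoint_dir, ppm_dir))
if (length(missing) > 0) {
  message("Run the earlier steps first; missing: ",
          paste(missing, collapse = ", "))
}
```

Only the two input directories are checked, because the output directory is documented as being created on demand.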
The primary output is written to disk in the output_dir directory.
The model matrix provides metafeature values (i.e. expectedness values for discrete features and polynomial values for continuous features) over the entire chord alphabet at every location in seq_test.
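To make the polynomial part concrete, the following sketch expands one continuous feature into poly_degree columns and codes missing entries with na_val. It assumes orthogonal polynomials from stats::poly(); the package may use a different basis, and the helper expand_feature is hypothetical.

```r
# Hypothetical helper (not part of the package): expand one continuous
# feature into `poly_degree` polynomial columns, coding NA as `na_val`.
expand_feature <- function(x, poly_degree = 4L, na_val = 0) {
  out <- matrix(na_val, nrow = length(x), ncol = poly_degree)
  ok <- !is.na(x)
  # stats::poly() rejects missing values, so expand observed values only.
  out[ok, ] <- stats::poly(x[ok], degree = poly_degree)
  colnames(out) <- paste0("degree_", seq_len(poly_degree))
  out
}

# Example feature values with one missing observation.
x <- c(0.2, 1.5, NA, 3.1, 4.8, 2.2, 0.9)
expanded <- expand_feature(x)
```

Here the row for the missing observation holds na_val in every column, while the remaining rows hold the polynomial terms.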