View source: R/model-dataset.R
Analyses a dataset of chord sequences by constructing and optimising a viewpoint regression model, and using this model to generate predictions for these sequences.
model_dataset(
  corpus_test,
  corpus_pretrain,
  output_dir,
  viewpoints = hvr::hvr_viewpoints,
  weights = NULL,
  poly_degree = 4L,
  max_iter = 500,
  corpus_test_folds = list(seq_along(corpus_test)),
  allow_repeats = FALSE,
  max_sample = Inf,
  sample_seed = 1,
  stm_opt = stm_options(),
  ltm_opt = ltm_options(),
  na_val = 0,
  perm_int = TRUE,
  perm_int_seed = 1,
  perm_int_reps = 5,
  allow_negative_weights = FALSE
)
corpus_test |
Corpus of chord sequences to predict,
as created by |
corpus_pretrain |
Corpus of chord sequences with which to pretrain the model,
as created by |
output_dir |
(Character scalar) Directory in which to save the model outputs. |
viewpoints |
List of viewpoints to apply, as created by |
weights |
(NULL or numeric vector) An optional set of viewpoint regression weights; if not provided, weights will be optimised automatically. These weights should be provided as a named numeric vector in a specific order; the best way to find this format is to fit a pilot regression model with the desired viewpoint set. |
poly_degree |
(Integer scalar) Degree of the polynomials to compute for the continuous features. |
max_iter |
(Integer scalar) Maximum number of iterations for the optimisation routine. |
corpus_test_folds |
List of cross-validation folds for applying discrete viewpoint models to
the sequences in |
allow_repeats |
(Logical scalar) Whether repeated chords are theoretically permitted in the chord sequences. It is recommended to remove such repetitions before modelling. |
max_sample |
(Numeric scalar)
Maximum number of events to sample for the model matrix,
defaults to |
sample_seed |
(Integer scalar) Random seed to make the downsampling reproducible. |
stm_opt |
Options list for the short-term PPM models, as created by the function
|
ltm_opt |
Options list for the long-term PPM models, as created by the function
|
na_val |
(Numeric scalar) Value to use to code for NA in the model matrix. The statistical analyses are mostly unaffected by this value. |
perm_int |
(Logical scalar) Whether to compute permutation-based feature importances. |
perm_int_seed |
(Integer scalar) Random seed for the permutation-based feature importances. |
perm_int_reps |
(Integer scalar) Number of replicates for the permutation-based feature importances (the final estimates are averages over these replicates). |
allow_negative_weights |
(Logical scalar)
Whether negative weights should be allowed for discrete features
( |
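The default for corpus_test_folds, list(seq_along(corpus_test)), is a single fold holding every sequence index. As a minimal sketch of how several folds could be built in that same list-of-index-vectors shape, the base R code below assumes a corpus of 10 sequences split into 3 folds. The helper make_folds and its arguments n_seq, n_folds and seed are illustrative names, not part of hvr.

```r
# Illustrative only: build cross-validation folds as a list of index vectors,
# mirroring the shape of the default list(seq_along(corpus_test)).
make_folds <- function(n_seq, n_folds, seed = 1) {
  set.seed(seed)                               # reproducible shuffling
  shuffled <- sample.int(n_seq)                # random permutation of 1..n_seq
  fold_id <- rep_len(seq_len(n_folds), n_seq)  # assign fold labels cyclically
  unname(split(shuffled, fold_id))             # one integer vector per fold
}

folds <- make_folds(n_seq = 10, n_folds = 3)
lengths(folds)  # fold sizes: 4, 3, 3
```

Every index appears in exactly one fold, so a list of this form could be passed in place of the default, with n_seq set to length(corpus_test).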
This function wraps the following sub-routines:
compute_viewpoints
compute_ppm_analyses
compute_model_matrix
viewpoint_regression
compute_predictions
Users may wish to use these sub-routines explicitly if performing repeated analyses with different parameter settings, to save redundant computation.
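The saving described above comes from running the expensive, setting-independent stages once and repeating only the later stages for each parameter setting. The base R sketch below shows that pattern in general terms. slow_features() and fit_model() are hypothetical stand-ins invented for this illustration; they are not the hvr sub-routines, whose signatures are not shown on this page.

```r
# Hypothetical stand-in for an expensive stage that does not depend on the
# parameter being varied.
n_feature_runs <- 0
slow_features <- function(x) {
  n_feature_runs <<- n_feature_runs + 1  # count how often this stage runs
  cbind(x, x^2)
}

# Hypothetical stand-in for a cheaper stage that depends on a setting.
fit_model <- function(features, scale) {
  colSums(features) * scale
}

x <- 1:5
features <- slow_features(x)             # computed once, outside the loop
settings <- c(0.5, 1, 2)
fits <- lapply(settings, function(s) fit_model(features, scale = s))

n_feature_runs  # 1: the expensive stage ran once for three settings
```

Calling the wrapper once per setting would instead repeat every stage each time.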
Various model outputs are saved to output_dir.
The function returns a tibble of predicted probabilities for the chords in corpus_test; see compute_predictions for an explanation of this tibble.