View source: R/stylest_select_vocab.R
Selects the optimal vocabulary quantile(s) for model fitting, based on performance in predicting the speakers of out-of-sample texts.
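To make the idea of a vocabulary quantile concrete, the base-R sketch below keeps only the terms whose frequency is at or above a given percentile of all term frequencies. It is an illustration under assumptions, not the package's internal code: the toy term counts and the helper name select_terms_by_cutoff are hypothetical, and the package may define its cutoff differently.

```r
# Hypothetical helper (not part of the package): keep the terms whose
# frequency is at or above the cutoff_pct-th percentile of all frequencies.
select_terms_by_cutoff <- function(term_freq, cutoff_pct) {
  threshold <- quantile(term_freq, probs = cutoff_pct / 100, names = FALSE)
  names(term_freq)[term_freq >= threshold]
}

# Toy term counts, made up for illustration
term_freq <- c(the = 120, of = 80, whale = 12, ship = 9, harpoon = 2)

terms_50 <- select_terms_by_cutoff(term_freq, 50)
terms_90 <- select_terms_by_cutoff(term_freq, 90)
print(terms_50)
print(terms_90)
```

A higher cutoff leaves a smaller vocabulary of more frequent terms, which is the trade-off the cross-validation is meant to evaluate.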
x | Corpus as text vector. May be a
speaker | Vector of speaker labels. Should be the same length as x
filter | if not
smooth | Value for smoothing. Defaults to 0.5
nfold | Number of folds for cross-validation. Defaults to 5
cutoff_pcts | Vector of cutoff percentages to test. Defaults to
cutoffs_term_weights | Named list of dataframes of term weights, where the names correspond to the
fill_method | if
fill_weight | Numeric value to fill in as weight for any term which does not have a weight specified in
weight_varname | Name of the column in each term_weights dataframe containing the weights, default=
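The role of fill_weight can be pictured with a small base-R sketch: vocabulary terms that have no entry in a term-weights dataframe receive a fixed numeric weight. Everything here is an assumption made for illustration: the helper fill_missing_weights, the column names term and weight, and the toy values are hypothetical, and the package's own column names and filling logic may differ.

```r
# Hypothetical helper (not part of the package): look up a weight for each
# vocabulary term and substitute fill_weight where no weight is specified.
fill_missing_weights <- function(vocab, term_weights, fill_weight,
                                 weight_varname = "weight") {
  # Position of each vocabulary term in the weights table (NA if absent)
  idx <- match(vocab, term_weights$term)
  w <- term_weights[[weight_varname]][idx]
  # Replace missing weights with the fill value
  w[is.na(w)] <- fill_weight
  setNames(w, vocab)
}

# Toy term weights, made up for illustration
term_weights <- data.frame(term = c("whale", "ship"),
                           weight = c(0.8, 0.3),
                           stringsAsFactors = FALSE)
vocab <- c("whale", "ship", "harpoon")

filled <- fill_missing_weights(vocab, term_weights, fill_weight = 1)
print(filled)
```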
A list containing: the cutoff percent with the best speaker classification rate; the cutoff percentages that were tested; a matrix of the mean percentage of incorrectly identified speakers for each cutoff percent and fold; and the number of folds used for cross-validation.
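As a rough illustration of how a best cutoff could be derived from a matrix of per-fold miss percentages, the base-R sketch below averages over folds and picks the cutoff with the lowest mean. The numbers are invented, and the layout (one row per fold, one column per cutoff) is an assumption; the matrix actually returned by the function may be oriented or named differently.

```r
cutoff_pcts <- c(50, 90)

# Hypothetical miss percentages: one row per fold, one column per cutoff.
# matrix() fills column by column, so the first five values are cutoff 50.
miss_pct <- matrix(
  c(40, 35, 45, 38, 42,   # cutoff 50
    30, 33, 28, 36, 31),  # cutoff 90
  nrow = 5, ncol = 2,
  dimnames = list(paste0("fold", 1:5), as.character(cutoff_pcts))
)

# Average miss percentage per cutoff, then take the cutoff with the lowest
mean_miss <- colMeans(miss_pct)
best_cutoff <- cutoff_pcts[which.min(mean_miss)]
print(mean_miss)
print(best_cutoff)
```

Averaging over folds before comparing cutoffs keeps a single unlucky fold from deciding the choice.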
## Not run:
data(novels_excerpts)
stylest_select_vocab(novels_excerpts$text, novels_excerpts$author, cutoff_pcts = c(50, 90))
## End(Not run)