stylest_fit: Fit speaker_model to a corpus

View source: R/stylest_fit.R

Description

stylest_fit, the main function in stylest, fits a model using a corpus of texts labeled by speaker.

Usage

stylest_fit(
  x,
  speaker,
  terms = NULL,
  filter = NULL,
  smooth = 0.5,
  term_weights = NULL,
  fill_method = "value",
  fill_weight = 0,
  weight_varname = "mean_distance"
)

Arguments

x

Text vector. May be a corpus_frame object

speaker

Vector of speaker labels. Should be the same length as x

terms

If not NULL, terms to be used in the model. If NULL, use all terms

filter

If not NULL, a text filter to specify the tokenization. See corpus for more information about specifying filter

smooth

Numeric value used to smooth term frequencies. Defaults to 0.5

term_weights

Data frame of distances (or any other weights) per word in the vocabulary. It should have one column, word, and a second column, named by weight_varname, containing the weight for each word. See the vignette for details.

fill_method

If "value" (default), fill_weight is used to fill any terms with NA weight. If "mean", the mean of the weights in term_weights is used as the fill value

fill_weight

Numeric value to fill in as the weight for any term that has no weight specified in term_weights. Defaults to 0.0, which drops any words without weights

weight_varname

Name of the column in term_weights containing the weights, default="mean_distance"
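
The interaction between term_weights, fill_method, fill_weight and weight_varname can be illustrated in base R. This is only a sketch of the behaviour described above, not the package's internal code: the vocabulary, the weights and the helper fill_weights are hypothetical and exist purely for illustration.

```r
# Hypothetical vocabulary and weights table; only the column layout
# (word plus a column named by weight_varname) follows the text above.
vocab <- c("the", "whale", "sea", "ship")
term_weights <- data.frame(
  word = c("the", "whale", "sea"),
  mean_distance = c(0.1, 0.9, 0.5),
  stringsAsFactors = FALSE
)
weight_varname <- "mean_distance"

# Hypothetical helper (not part of stylest): look up each vocabulary
# term's weight and fill the missing ones according to fill_method.
fill_weights <- function(vocab, term_weights, weight_varname,
                         fill_method = "value", fill_weight = 0) {
  w <- term_weights[[weight_varname]][match(vocab, term_weights$word)]
  names(w) <- vocab
  fill <- if (fill_method == "mean") mean(w, na.rm = TRUE) else fill_weight
  w[is.na(w)] <- fill
  w
}

by_value <- fill_weights(vocab, term_weights, weight_varname, "value", 0)
by_mean <- fill_weights(vocab, term_weights, weight_varname, "mean")

print(by_value)  # "ship" receives fill_weight
print(by_mean)   # "ship" receives the mean of the known weights
```

With fill_weight = 0 the unweighted term contributes nothing, which matches the note that words without weights are dropped.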

Details

If terms is NULL, all terms are used. The usage above takes no cutoff argument, so the vocabulary is restricted through terms alone.

Value

An S3 stylest_model object containing:

speakers

Vector of unique speakers

filter

text_filter used

terms

Terms used in fitting the model

ntoken

Vector of number of tokens per speaker

smooth

Smoothing value

weights

If not NULL, a named matrix of weights for each term in the vocab

rate

Matrix of speaker rates for each term in vocabulary
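
To make the roles of ntoken, smooth and rate more concrete, the following base R sketch computes a smoothed speaker-by-term rate matrix from raw counts. The exact estimator used by the package is not stated on this page, so additive smoothing is an assumption here, and the counts are hypothetical.

```r
# Hypothetical term counts: rows are speakers, columns are terms.
counts <- matrix(
  c(3, 0, 1,
    1, 2, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("A", "B"), c("sea", "ship", "whale"))
)
smooth <- 0.5

# Number of tokens per speaker.
ntoken <- rowSums(counts)

# Additive smoothing (assumed, not confirmed by the documentation):
# every count is increased by `smooth`, and each row is renormalised
# so that a speaker's rates sum to one. The division recycles the
# length-2 vector down the columns, so row i is divided by the
# denominator for speaker i.
rate <- (counts + smooth) / (ntoken + smooth * ncol(counts))

print(rate)
print(rowSums(rate))
```

Under this assumption a term a speaker never used still receives a small positive rate, which keeps log-likelihoods finite when scoring new texts.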

Examples
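
A minimal sketch of a call to stylest_fit, using the signature shown under Usage. The toy corpus and speaker labels are hypothetical and far too small for a meaningful model. The call is guarded so that the script still runs when the stylest package is not installed.

```r
# Toy corpus (hypothetical data, for illustration only).
texts <- c(
  "the whale swam far out in the cold sea",
  "the ship sailed after the whale",
  "she walked through the garden in the morning",
  "the garden was quiet and the morning was bright"
)
speakers <- c("A", "A", "B", "B")

if (requireNamespace("stylest", quietly = TRUE)) {
  # x is the text vector, speaker the labels of the same length;
  # all other arguments keep their defaults.
  mod <- stylest::stylest_fit(texts, speakers)

  # Components listed under Value.
  print(mod$speakers)
  print(mod$ntoken)
  print(mod$smooth)
} else {
  mod <- NULL
  message("stylest is not installed; skipping the fit")
}
```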

stylest documentation built on March 5, 2021, 1:05 a.m.