sm_text_tfidf: Construct the TF-IDF Matrix from Annotation or Data Frame
In statsmaths/smodels: Consistent format for fitting models

Given a set of annotations, this function returns the term frequency-inverse document frequency (tf-idf) matrix built from the extracted lemmas.
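As a rough illustration of what a tf-idf matrix contains, the base R sketch below builds one from a toy token table. The weighting used here (raw term count times `log(N / df)`) is an assumption chosen for simplicity and may differ from the weighting `sm_text_tfidf` applies internally. The data and the names `tokens`, `tf`, `df`, `idf`, and `tfidf` are made up for the example.

```r
# Toy token table: one row per token, with a document id and a lemma.
tokens <- data.frame(
  doc_id = c("d1", "d1", "d1", "d2", "d2", "d3"),
  lemma  = c("cat", "cat", "dog", "dog", "fish", "dog"),
  stringsAsFactors = FALSE
)

# Term counts: rows are documents, columns are lemmas.
tf <- unclass(table(tokens$doc_id, tokens$lemma))

# Document frequency: the number of documents that contain each lemma.
df <- colSums(tf > 0)

# Assumed weighting (not necessarily the package's): count * log(N / df).
idf <- log(nrow(tf) / df)
tfidf <- sweep(tf, 2, idf, `*`)

print(round(tfidf, 3))
```

Under this weighting, a lemma that occurs in every document (here `dog`) gets a weight of zero, while a lemma concentrated in one document (here `cat` in `d1`) gets the largest weight.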

sm_text_tfidf(
  object,
  min_df = 0.1,
  max_df = 0.9,
  max_features = 10000,
  doc_var = "doc_id",
  token_var = "lemma",
  vocabulary = NULL
)

`object`	a data frame containing a document identifier column (named by `doc_var`) and a token column (named by `token_var`)
`min_df`	the minimum proportion of documents a token must appear in to be included in the vocabulary
`max_df`	the maximum proportion of documents a token may appear in to be included in the vocabulary
`max_features`	the maximum number of tokens in the vocabulary
`doc_var`	character vector. The name of the column in `object` that contains the document ids. Defaults to "doc_id".
`token_var`	character vector. The name of the column in `object` that contains the tokens. Defaults to "lemma".
`vocabulary`	character vector. The vocabulary set to use in constructing the matrices. Will be computed within the function if set to `NULL`. When supplied, the options `min_df`, `max_df`, and `max_features` are ignored.
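The way `min_df`, `max_df`, and `max_features` narrow the vocabulary can be sketched in base R as follows. This is a minimal illustration and not the package's own code: it assumes the bounds are inclusive and that the `max_features` cap keeps the tokens found in the most documents. The toy data and the variable names `pairs`, `doc_prop`, and `keep` are hypothetical.

```r
# Toy token table using the default column names "doc_id" and "lemma".
tokens <- data.frame(
  doc_id = c("d1", "d1", "d2", "d2", "d3", "d3", "d4"),
  lemma  = c("the", "cat", "the", "dog", "the", "cat", "the"),
  stringsAsFactors = FALSE
)

min_df <- 0.3
max_df <- 0.9
max_features <- 10

# Count each (document, lemma) pair once, then compute the proportion
# of documents in which each lemma appears.
pairs <- unique(tokens[, c("doc_id", "lemma")])
n_docs <- length(unique(tokens$doc_id))
doc_prop <- table(pairs$lemma) / n_docs

# Keep lemmas inside the [min_df, max_df] band (inclusive bounds assumed).
keep <- doc_prop[doc_prop >= min_df & doc_prop <= max_df]

# Cap the vocabulary size, preferring the most widespread lemmas (assumed).
keep <- sort(keep, decreasing = TRUE)
vocabulary <- names(head(keep, max_features))

print(vocabulary)
```

In this toy data, `the` appears in every document and exceeds `max_df`, `dog` appears in only a quarter of the documents and falls below `min_df`, and `cat` is the only lemma left. A character vector built this way could then be supplied as the `vocabulary` argument, in which case the three filtering options are ignored.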