test_language_model: Test Language Model


View source: R/test_language_model.R

Description

This function tests a model created by the language_model function on a new dataset.

Usage

test_language_model(
  input,
  outcome,
  text,
  trainedModel,
  ngrams = "1",
  dfmWeightScheme = "count",
  progressBar = TRUE
)

Arguments

input

A dataframe containing a column with text data (character strings) and an outcome variable (numeric or two-level factor).

outcome

A string giving the name of the column in input that holds the outcome variable.

text

A string giving the name of the column in input that holds the text data.

trainedModel

A trained model created by the language_model function.

ngrams

A string defining the ngrams to serve as predictors in the model. Defaults to "1". For more information, see the tokens_ngrams function in the quanteda package.

dfmWeightScheme

A string defining the weight scheme to use when constructing the document-feature matrix. Defaults to "count". For more information, see the dfm_weight function in the quanteda package.

progressBar

Show a progress bar. Defaults to TRUE.
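
The ngrams and dfmWeightScheme arguments both point to quanteda functions, so a small quanteda-only sketch may help show what those settings control. This is an illustration with made-up toy text, not the package's internal code; how test_language_model translates its ngrams string into the n argument of tokens_ngrams is not documented here and is left out.

```r
library(quanteda)

# Two toy documents (hypothetical data, for illustration only)
txt <- c(d1 = "a great great movie", d2 = "a dull movie")

toks <- tokens(txt)

# Unigrams plus bigrams: n = 1:2 asks tokens_ngrams() for both sizes
toks_12 <- tokens_ngrams(toks, n = 1:2)

# Document-feature matrix of raw counts (what the "count" scheme keeps)
dfm_count <- dfm(toks_12)

# The same matrix reweighted to within-document proportions
dfm_prop <- dfm_weight(dfm_count, scheme = "prop")

print(dfm_count)
print(dfm_prop)
```

With scheme = "prop", each document's feature weights sum to 1, which keeps long and short documents on a comparable scale; "count" leaves the raw frequencies unchanged.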

Details

This function is effectively a special version of the language_model function. Instead of fitting a new model, it applies an existing trained model to a new, independent dataset and bases its outputs on those results. This makes it possible to assess how well a trained language model generalizes to other inputs, and to compare models using many of the same functions that work with language_model output.

Value

An object of type "testAssessment".

See Also

language_model

Examples

## Not run: 
movie_review_data1$cleanText = clean_text(movie_review_data1$text)
movie_review_data2$cleanText = clean_text(movie_review_data2$text)

# Train a model on the movie_review_data1 dataset
# Using language to predict "Positive" vs. "Negative" reviews
movie_model_valence = language_model(movie_review_data1,
                                     outcome = "valence",
                                     outcomeType = "binary",
                                     text = "cleanText")

# Test the model on the movie_review_data2 dataset
movie_model_valence_test = test_language_model(movie_review_data2,
                                               outcome = "valence",
                                               text = "cleanText",
                                               trainedModel = movie_model_valence)
summary(movie_model_valence_test)

## End(Not run)

nlanderson9/languagePredictR documentation built on June 10, 2021, 11 a.m.