preprocessing_choice_regression: Preprocessing Choice Regressions


View source: R/preprocessing_choice_regression.R

Description

Regresses an outcome variable on preprocessing choices to assess the effect of each preprocessing decision.

Usage

preprocessing_choice_regression(
  Y,
  choices,
  dataset = "UK",
  base_case_index = 128
)

Arguments

Y

A numeric vector, usually of length 128, containing the outcome variable. This should hold the preText (or other) score, with one entry per preprocessing specification.

choices

A 128 x 7 data.frame of preprocessing choices, as returned in the '$choices' field of the 'factorial_preprocessing()' output.

dataset

The name to be given to the data we are analyzing.

base_case_index

Optional row index of a base case. That row is removed from the choices data before the regression is performed. Defaults to 128.
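To make the shapes of these arguments concrete, here is a minimal base-R sketch of the general idea: a full factorial of 7 binary choices gives 2^7 = 128 rows, one row is dropped as the base case, and the outcome is regressed on the remaining rows. This is not the package's implementation. The column names (step_1 to step_7), the simulated outcome, and the use of lm() are all assumptions made for illustration.

```r
# Illustrative sketch only; not the code used inside preText.
set.seed(1)

# Hypothetical 128 x 7 table of binary preprocessing choices.
# expand.grid() enumerates all 2^7 combinations; the column names are made up.
choices <- expand.grid(
  step_1 = c(TRUE, FALSE),
  step_2 = c(TRUE, FALSE),
  step_3 = c(TRUE, FALSE),
  step_4 = c(TRUE, FALSE),
  step_5 = c(TRUE, FALSE),
  step_6 = c(TRUE, FALSE),
  step_7 = c(TRUE, FALSE)
)

# Simulated outcome with one value per specification (stands in for a score).
Y <- 0.1 * choices$step_1 + rnorm(nrow(choices), sd = 0.01)

# Drop the base-case row (here row 128, where every choice is FALSE).
base_case_index <- 128
df <- cbind(Y = Y, choices)[-base_case_index, ]

# Regress the outcome on all seven choices.
fit <- lm(Y ~ ., data = df)
coefficients(summary(fit))
```

Each coefficient in such a model estimates how switching one choice on changes the outcome, holding the other choices fixed. The data.frame returned by preprocessing_choice_regression() may be organized differently.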

Value

A data.frame

Examples

## Not run: 
# *** note that this function is already called in the preText() function and
# its output is returned in the results.
# load the package
library(preText)
# load in the data
data("UK_Manifestos")
# preprocess data
preprocessed_documents <- factorial_preprocessing(
    UK_Manifestos,
    use_ngrams = TRUE,
    infrequent_term_threshold = 0.02,
    verbose = TRUE)
# run preText
preText_results <- preText(
    preprocessed_documents,
    dataset_name = "UK Manifestos",
    distance_method = "cosine",
    num_comparisons = 100,
    verbose = TRUE)
# get regression results
reg_results <- preprocessing_choice_regression(
     preText_results$preText_scores$preText_score,
     preprocessed_documents$choices,
     dataset = "UK Manifestos",
     base_case_index = 128)

## End(Not run)

matthewjdenny/preptest documentation built on July 27, 2021, 1:19 a.m.