This function is used in the variable setup functions for the Cox regression model. It splits the full data into two data frames, one of answered and one of unanswered questions. It then uses the get_freq_terms function from this package to build, for each group, a data frame of the most commonly used words in the user-specified text variable (the fitted model used question titles). The two resulting data frames are then joined by word.
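The split, count, and join steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `count_terms` is a hypothetical stand-in for get_freq_terms, the `answered` column name is assumed, and the source does not say whether words seen in only one group are kept (this sketch keeps them).

```r
# Hypothetical stand-in for the package's get_freq_terms(): count lowercase
# word tokens in a character vector and drop any stopwords.
count_terms <- function(text, stopwords = NULL) {
  tokens <- unlist(strsplit(tolower(text), "[^a-z]+"))
  tokens <- tokens[nzchar(tokens) & !(tokens %in% stopwords)]
  counts <- table(tokens)
  data.frame(word = names(counts), n = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Toy data; the `answered` column name is an assumption.
questions <- data.frame(
  title = c("iphone screen cracked", "screen will not turn on",
            "battery drains fast", "battery swollen"),
  answered = c(TRUE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)
stops <- c("will", "not", "on")

# 1. Separate answered and unanswered questions.
answered   <- questions[questions$answered, ]
unanswered <- questions[!questions$answered, ]

# 2. Count terms in each subset.
freq_a <- count_terms(answered$title, stopwords = stops)
freq_u <- count_terms(unanswered$title, stopwords = stops)

# 3. Join the two frequency tables by word, keeping words seen in either.
joined <- merge(freq_a, freq_u, by = "word", all = TRUE,
                suffixes = c("_answered", "_unanswered"))
print(joined)
```

Words that occur in only one group get `NA` in the other group's count column, which matters for any ratio computed afterwards.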
get_au_terms(data, variable, stopwords = NULL, remove = NULL)
data: The full data set.
variable: The name of the variable to get the frequent terms from, given as a string. In the model, question titles were used.
stopwords: Optional. Additional stopwords to remove, given as a string or character vector. For the model, "can", "will", "cant", "wont", "works", "get", "help", "need", "fix", "doesnt", "dont" were removed.
remove: Optional. Words to remove from the resulting data frame, given as a string or character vector. For the model, words that matched any of the category, subcategory, or new_category levels were removed.
Returns a data frame of words from the input text variable, with each word's frequency in all of the data, in answered questions, and in unanswered questions, plus a ratio calculated as frequency in answered divided by frequency in unanswered. The resulting data frame is used in the exploratory_setup and variable_setup functions to build the contain_answered and contain_unanswered variables.
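As a worked example of the ratio described above, the following sketch computes it on made-up counts. The column names and numbers are illustrative assumptions, not the package's actual output, and the last step only imitates what a `remove` vector does.

```r
# Hypothetical counts; column names are illustrative, not the package's.
terms <- data.frame(
  word         = c("screen", "battery", "update"),
  n_all        = c(30L, 24L, 12L),
  n_answered   = c(20L, 8L, 6L),
  n_unanswered = c(10L, 16L, 6L),
  stringsAsFactors = FALSE
)

# Ratio as described above: frequency in answered / frequency in unanswered.
terms$ratio <- terms$n_answered / terms$n_unanswered

# Drop words supplied through a `remove`-style character vector.
remove <- c("update")
terms <- terms[!(terms$word %in% remove), ]
print(terms)
```

A ratio above 1 means the word appears more often in answered questions than in unanswered ones, and a ratio below 1 means the opposite.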
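# Extra stopwords to drop before counting terms
words <- c("can", "will", "cant", "wont", "works", "get", "help", "need", "fix", "doesnt", "dont")
# Words to exclude from the resulting data frame
devices <- c("iphone", "macbook", "imac", "ipad")
# x is the full data set; "title" is the text variable to analyse
get_au_terms(data = x, variable = "title", stopwords = words, remove = devices)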