sibp_top_words
Returns a data frame of the words most associated with each treatment.
sibp_top_words(sibp.fit, words, num.words = 10, verbose = FALSE)
sibp.fit | A sibp object (the fitted model returned by sibp).
words | The actual words, usually obtained through colnames(X).
num.words | The number of top words to report.
verbose | If TRUE, reports how common each treatment is (so that the analyst can focus on the common treatments) and how closely associated each word is with each treatment.
The choice of parameter configuration should always be based on which treatments are substantively the most interesting. This function provides one natural way to discover which words are most associated with each treatment. Association is measured by the mean parameter of the posterior distribution of phi, where phi is the effect of the treatment on the count of word w. The output therefore helps to determine which treatments are most interesting.
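As a rough illustration of the ranking idea described above, the following sketch orders words within each treatment by a posterior mean of phi. It is not the package's implementation: the matrix `phi_mean`, the helper `top_words_sketch`, the column names, and all of the numbers are hypothetical and chosen only to make the example self-contained.

```r
# Hypothetical posterior means of phi: one row per treatment, one column per word.
# These values are invented for illustration and do not come from a fitted model.
words <- c("senate", "army", "novel", "law", "film")
phi_mean <- rbind(
  c( 0.9, 0.1, -0.2, 0.7, 0.0),
  c(-0.1, 0.2,  0.8, 0.0, 0.6)
)

# Rank words within each treatment by posterior mean, largest first,
# and return one column of top words per treatment.
top_words_sketch <- function(phi_mean, words, num.words = 10) {
  n <- min(num.words, length(words))
  top <- apply(phi_mean, 1, function(row) {
    words[order(row, decreasing = TRUE)[seq_len(n)]]
  })
  top <- matrix(top, nrow = n)  # keep a matrix even when n == 1
  out <- as.data.frame(top, stringsAsFactors = FALSE)
  colnames(out) <- paste0("treatment.", seq_len(nrow(phi_mean)))
  out
}

top.words <- top_words_sketch(phi_mean, words, num.words = 3)
print(top.words)
```

Each column of the result lists one treatment's words from most to least associated, which is the same layout as the data frame described for this function.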
top.words | A data frame where each column consists of the top num.words words (ten by default), in order, associated with a given treatment.
Christian Fong
Fong, Christian and Justin Grimmer. 2016. “Discovery of Treatments from Text Corpora.” Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. https://aclweb.org/anthology/P/P16/P16-1151.pdf
sibp
## Load the Wikipedia biography data
data(BioSample)

# Divide into training and test sets
Y <- BioSample[,1]
X <- BioSample[,-1]
set.seed(1)
train.ind <- sample(1:nrow(X), size = 0.5*nrow(X), replace = FALSE)

# Fit an sIBP on the training data
sibp.fit <- sibp(X, Y, K = 2, alpha = 4, sigmasq.n = 0.8,
                 train.ind = train.ind)

sibp_top_words(sibp.fit, colnames(X))