View source: R/sibp_exclusivity.R
sibp_exclusivity calculates the coherence metric for an sibp object fit on a training set. sibp_rank_runs runs sibp_exclusivity on each element in the list returned by sibp_param_search, and ranks the parameter configurations from most to least promising.
sibp_exclusivity(sibp.fit, X, num.words = 10)
sibp_rank_runs(sibp.search, X, num.words = 10)
sibp.fit | A sibp object fit on a training set.
sibp.search | A list of sibp objects, as returned by sibp_param_search.
X | The covariates for the full data set. The division between the training and test set is handled inside the function.
num.words | The number of top words whose coherence will be evaluated.
The metric is formally described at the top of page 1605 of https://aclweb.org/anthology/P/P16/P16-1151.pdf. Its purpose is only to suggest which parameter configurations might contain the most interesting treatments to test when there are too many configurations to investigate manually. The final choice of parameter configuration should always be based on which treatments are substantively the most interesting; see sibp_top_words.
exclusivity | An exclusivity matrix which quantifies the degree to which the top words in a treatment appear in documents that have that treatment but not in documents that lack that treatment.
exclusivity_rank | A table that ranks the treatments discovered by the various runs from sibp.search from most exclusive to least exclusive.
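The idea behind the exclusivity values can be illustrated with a toy calculation in base R. This is only a simplified sketch of "top words appear in documents with the treatment but not in documents without it"; it is not the package's implementation and not the exact formula from the paper. The function toy_exclusivity, its arguments, and the small matrices below are all hypothetical names and made-up data introduced for illustration.

```r
# Illustrative only: a toy exclusivity-style score in base R.
# NOT the package's implementation; every name here is hypothetical.
toy_exclusivity <- function(toy.X, toy.Z, top.words) {
  # toy.X:     documents x words matrix of word counts
  # toy.Z:     documents x treatments binary matrix (1 = document has the treatment)
  # top.words: list with one integer vector of word column indices per treatment
  sapply(seq_len(ncol(toy.Z)), function(k) {
    has.k <- toy.Z[, k] == 1
    words <- top.words[[k]]
    # Average top-word count among documents that have treatment k
    with.k <- mean(toy.X[has.k, words, drop = FALSE])
    # Average top-word count among documents that lack treatment k
    without.k <- mean(toy.X[!has.k, words, drop = FALSE])
    with.k - without.k
  })
}

# Made-up corpus: 4 documents, 3 words, 2 treatments
toy.X <- matrix(c(3, 0, 1,
                  2, 0, 0,
                  0, 4, 1,
                  0, 2, 1), nrow = 4, byrow = TRUE)
toy.Z <- matrix(c(1, 0,
                  1, 0,
                  0, 1,
                  0, 1), nrow = 4, byrow = TRUE)
# Assume word 1 is the top word of treatment 1 and word 2 of treatment 2
top.words <- list(1L, 2L)

scores <- toy_exclusivity(toy.X, toy.Z, top.words)
print(scores)
```

A larger score means the top words of a treatment are concentrated in documents that have that treatment, which is the property the ranking rewards.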
Christian Fong
Fong, Christian and Justin Grimmer. 2016. “Discovery of Treatments from Text Corpora.” Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. https://aclweb.org/anthology/P/P16/P16-1151.pdf
sibp_param_search, sibp_top_words
## Load the sample of Wikipedia biography data
data(BioSample)
# Divide into training and test sets
Y <- BioSample[,1]
X <- BioSample[,-1]
set.seed(1)
train.ind <- sample(1:nrow(X), size = 0.5*nrow(X), replace = FALSE)
# Search sIBP for several parameter configurations; fit each to the training set
sibp.search <- sibp_param_search(X, Y, K = 2, alphas = c(2,4),
                                 sigmasq.ns = c(0.8, 1), iters = 1,
                                 train.ind = train.ind)
# Get metric for evaluating most promising parameter configurations
sibp_rank_runs(sibp.search, X, 10)