permutationTest: Permutation test of a binary covariate.
In stm: Estimation of the Structural Topic Model

permutationTest

R Documentation

Description

Run a permutation test in which a binary treatment variable is randomly permuted and the topic model is re-estimated.

Usage

permutationTest(
  formula,
  stmobj,
  treatment,
  nruns = 100,
  documents,
  vocab,
  data,
  seed = NULL,
  stmverbose = TRUE,
  uncertainty = "Global"
)

Arguments

`formula`	A formula for the prevalence component of the `stm` model and the `estimateEffect` call. This formula must contain at least one binary covariate (specified through the argument `treatment`) but it can contain other terms as well. If the binary covariate is interacted with additional variables the estimated quantity of interest is the effect when those additional variables are set to 0.
`stmobj`	Model output from a single run of `stm` which contains the reference effect.
`treatment`	A character string containing the treatment id as used in the formula of the stmobj. This is the variable that is randomly permuted.
`nruns`	Number of total models to fit (including the original model).
`documents`	The documents used in the stmobj model.
`vocab`	The vocab used in the stmobj model.
`data`	The data used in the stmobj model.
`seed`	Optionally a seed with which to replicate the result. As in `stm` the seed is automatically saved and returned as part of the object. Passing the seed here will replicate the previous run.
`stmverbose`	Should the stm model be run with `verbose=TRUE`? Setting this to `FALSE` suppresses only the model-specific printing. An update on which model is being run will still print to the screen.
`uncertainty`	Which procedure should be used to approximate the measurement uncertainty in the topic proportions. See details for more information. Defaults to the Global approximation.

Details

This function takes a single binary covariate and runs a permutation test where, rather than using the true assignment, the covariate is randomly drawn with probability equal to its empirical probability in the data. After each redraw of the covariate, the same STM is estimated at different starting values using the same initialization procedure as the original model, and the effect of the covariate across topics is calculated.
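
The redrawing step described above can be sketched in base R. This is a minimal illustration, assuming the treatment is coded 0/1 and that each placebo assignment is an independent Bernoulli draw at the empirical treatment rate. The data frame `meta_toy` and the helper `draw_placebo` are hypothetical names used only for this sketch; they are not part of stm.

```r
# Hypothetical toy metadata with a binary (0/1) treatment column.
meta_toy <- data.frame(treatment = c(1, 0, 0, 1, 1, 0, 0, 0, 1, 0))

# Draw a placebo assignment: each document is "treated" with probability
# equal to the observed share of treated documents (an assumption of
# this sketch, based on the description above).
draw_placebo <- function(x) {
  stopifnot(all(x %in% c(0, 1)))
  rbinom(n = length(x), size = 1, prob = mean(x))
}

set.seed(1)
placebo <- draw_placebo(meta_toy$treatment)
```

Because the assignment is drawn rather than shuffled in this sketch, the number of treated documents can differ from one run to the next.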

Next the function records two quantities of interest across this set of "runs" of the model. The first records the absolute maximum effect of the permuted covariate across all topics.
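
As a rough illustration of the first quantity, the sketch below picks the effect with the largest absolute value from a set of per-topic estimates. The numbers in `effects` are made up, and keeping the signed value of the largest effect is an assumption of this sketch rather than a statement about how stm stores the result.

```r
# Hypothetical point estimates of the treatment effect, one per topic,
# from a single placebo run of a 3-topic model.
effects <- c(topic1 = 0.04, topic2 = -0.11, topic3 = 0.07)

# Position of the topic whose effect is largest in absolute value.
largest <- which.max(abs(effects))

# The (signed) effect for that topic.
max_effect <- effects[[largest]]
```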

The second records, for each additional stm run, the effect of the (permuted) covariate on the topic estimated to be closest to the topic of interest from the original stm model. The topic of interest is specified in plot.STMpermute. Uncertainty can be calculated using the standard options in estimateEffect.
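
One way to picture "closest topic" is to compare topic-word distributions between the reference model and a placebo run. The sketch below is only an illustration under stated assumptions: closeness is measured by the inner product of the word distributions, and each reference topic is matched independently to its most similar new topic. The matrices `beta_ref` and `beta_new` are hypothetical, and the alignment used inside stm may differ.

```r
# Hypothetical topic-word probability matrices (rows = topics, cols = words).
beta_ref <- rbind(c(0.7, 0.2, 0.1),
                  c(0.1, 0.8, 0.1),
                  c(0.1, 0.1, 0.8))

# A placebo run that recovers the same topics in a different order.
beta_new <- beta_ref[c(3, 1, 2), ]

# Similarity between every reference topic (rows) and new topic (columns).
sim <- beta_ref %*% t(beta_new)

# For each reference topic, the index of the most similar new topic.
match_idx <- apply(sim, 1, which.max)
```

With such an index in hand, the effect recorded for reference topic `k` in a given run would be the effect estimated for topic `match_idx[k]` of that run.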

Value

`ref`	A list of K elements containing the quantiles of the estimated effect for the reference model.
`permute`	A list where each element is an aligned model parameter summary.
`variable`	The variable id that was permuted.
`seed`	The seed for the stm model.

Examples


## Not run: 
library(stm)

# Preprocess the gadarian open-ended responses shipped with stm
temp <- textProcessor(documents = gadarian$open.ended.response, metadata = gadarian)
out <- prepDocuments(temp$documents, temp$vocab, temp$meta)
documents <- out$documents
vocab <- out$vocab
meta <- out$meta

# Fit the reference model and estimate the treatment effect
set.seed(02138)
mod.out <- stm(documents, vocab, 3, prevalence = ~ treatment + s(pid_rep), data = meta)
summary(mod.out)
prep <- estimateEffect(1:3 ~ treatment + s(pid_rep), mod.out, meta)
plot(prep, "treatment", model = mod.out,
     method = "difference", cov.value1 = 1, cov.value2 = 0)

# Run the permutation test (25 models in total, including the original)
test <- permutationTest(formula = ~ treatment + s(pid_rep), stmobj = mod.out,
                        treatment = "treatment", nruns = 25, documents = documents,
                        vocab = vocab, data = meta, stmverbose = FALSE)

# Plot the placebo effects for the topic matched to topic 2
plot(test, 2, xlab = "Effect", ylab = "Model Index", main = "Topic 2 Placebo Test")

## End(Not run)

stm documentation built on June 24, 2024, 5:18 p.m.