plot.MultimodDiagnostic | R Documentation |
The plotting method for objects of the S3 class 'MultimodDiagnostic'. These
objects are returned by the function multiSTM(), which performs a battery of
tests aimed at assessing the stability of the local modes of an STM model.
## S3 method for class 'MultimodDiagnostic'
plot(x, ind = NULL, topics = NULL, ...)
x |
An object of S3 class 'MultimodDiagnostic'. See
|
ind |
An integer or list of integers specifying which plots to generate
(see details). If |
topics |
An integer or vector of integers specifying the topics for
which to plot the posterior distribution of covariate effect estimates. If
|
... |
Other arguments to be passed to the plotting functions. |
This method generates a series of plots, indexed as follows. If only a
subset of the plots is required, specify their indexes using the ind
argument. Note that not all plot types are available for every object
of class 'MultimodDiagnostic':
1. Histogram of Expected Common Words: Generates a 10-bin histogram of the column means of obj$wmat, a K-by-N matrix reporting the number of "top words" shared by the reference model and the candidate model. The "top words" for a given topic are defined as the 10 highest-frequency words.
2. Histogram of Expected Common Documents: Generates a 10-bin histogram of the column means of obj$tmat, a K-by-N matrix reporting the number of "top documents" shared by the reference model and the candidate model. The "top documents" for a given topic are defined as the 10 documents in the reference corpus with highest topical frequency.
3. Distribution of .95 Confidence-Interval Coverage for Regression Estimates: Generates a histogram of obj$confidence.ratings, a vector whose entries specify the proportion of regression coefficient estimates in a candidate model that fall within the .95 confidence interval for the corresponding estimate in the reference model. This can only be generated if obj$confidence.ratings is non-NULL.
4. Posterior Distributions of Covariate Effect Estimates By Topic: Generates a square matrix of plots, each depicting the posterior distribution of the regression coefficients for the covariate specified in obj$reg.parameter.index for one topic. The topics for which the plots are to be generated are specified by the topics argument. If the length of topics is not a perfect square, the plot matrix will include white space. The plots have a dashed black vertical line at zero and a continuous red vertical line indicating the coefficient estimate in the reference model. This can only be generated if obj$cov.effects is non-NULL.
5. Histogram of Expected L1-Distance From Reference Model: Generates a 10-bin histogram of the column means of obj$lmat, a K-by-N matrix reporting the L1-distance of each topic from the corresponding one in the reference model.
6. L1-Distance vs. Top-10 Words Metric: Produces a smoothed color density representation of the scatterplot of obj$lmat and obj$wmat, the metrics for L1-distance and shared top-words, obtained through a kernel density estimate. This can be used to validate the metrics under consideration.
7. L1-Distance vs. Top-10 Docs Metric: Produces a smoothed color density representation of the scatterplot of obj$lmat and obj$tmat, the metrics for L1-distance and shared top-documents, obtained through a kernel density estimate. This can be used to validate the metrics under consideration.
8. Top-10 Words vs. Top-10 Docs Metric: Produces a smoothed color density representation of the scatterplot of obj$wmat and obj$tmat, the metrics for shared top-words and shared top-documents, obtained through a kernel density estimate. This can be used to validate the metrics under consideration.
9. Maximized Bound vs. Aggregate Top-10 Words Metric: Generates a scatter plot with linear trendline for the maximized bound vector (obj$lb) and a linear transformation of the top-words metric aggregated by model (obj$wmod/1000).
10. Maximized Bound vs. Aggregate Top-10 Docs Metric: Generates a scatter plot with linear trendline for the maximized bound vector (obj$lb) and a linear transformation of the top-docs metric aggregated by model (obj$tmod/1000).
11. Maximized Bound vs. Aggregate L1-Distance Metric: Generates a scatter plot with linear trendline for the maximized bound vector (obj$lb) and a linear transformation of the L1-distance metric aggregated by model (obj$lmod/1000).
12. Top-10 Docs Metric vs. Semantic Coherence: Generates a scatter plot with linear trendline for the reference-model semantic coherence scores and the column means of obj$tmat.
13. L1-Distance Metric vs. Semantic Coherence: Generates a scatter plot with linear trendline for the reference-model semantic coherence scores and the column means of obj$lmat.
14. Top-10 Words Metric vs. Semantic Coherence: Generates a scatter plot with linear trendline for the reference-model semantic coherence scores and the column means of obj$wmat.
15. Same as 5, but using the limited-mass L1-distance metric. Can only be generated if obj$mass.threshold != 1.
16. Same as 11, but using the limited-mass L1-distance metric. Can only be generated if obj$mass.threshold != 1.
17. Same as 7, but using the limited-mass L1-distance metric. Can only be generated if obj$mass.threshold != 1.
18. Same as 13, but using the limited-mass L1-distance metric. Can only be generated if obj$mass.threshold != 1.
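The following base R sketch illustrates three of the operations described in the list above: a histogram of column means, the square grid used for per-topic panels, and a scatter plot with a linear trendline. It is not the package's implementation. The matrix and vectors are simulated stand-ins, the matrix layout (one row per candidate model, one column per topic) is an assumption made for the sketch, and names such as agg_words and grid_side are hypothetical.

```r
# Simulated stand-ins for fields of a 'MultimodDiagnostic' object.
# All values are invented; the layout (rows = candidate models,
# columns = topics) is an assumption made for this sketch.
set.seed(42)
n_models <- 20
n_topics <- 9
wmat <- matrix(sample(0:10, n_models * n_topics, replace = TRUE),
               nrow = n_models, ncol = n_topics)
lb <- rnorm(n_models, mean = -50000, sd = 200)  # hypothetical bound values

pdf(NULL)  # discard graphics output so the sketch runs without a display

# In the style of plot 1: histogram of the column means, requesting
# roughly 10 bins (hist() treats 'breaks' as a suggestion).
expected_common <- colMeans(wmat)
h <- hist(expected_common, breaks = 10,
          main = "Histogram of Expected Common Words",
          xlab = "Mean number of shared top words")

# In the style of plot 4: side length of the smallest square grid that
# holds one panel per requested topic; unused cells stay blank.
topics <- 1:7
grid_side <- ceiling(sqrt(length(topics)))
par(mfrow = c(grid_side, grid_side))
par(mfrow = c(1, 1))  # reset the layout

# In the style of plot 9: scatter plot with a linear trendline.
agg_words <- rowSums(wmat) / 1000  # hypothetical per-model aggregate
plot(lb, agg_words,
     xlab = "Maximized bound", ylab = "Aggregate metric / 1000")
fit <- lm(agg_words ~ lb)
abline(fit, lwd = 2)

invisible(dev.off())
```

With seven requested topics the grid is 3-by-3, so two cells remain empty, which is the white space mentioned for plot 4.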
Brandon M. Stewart (Princeton University) and Antonio Coppola (Harvard University)
Roberts, M., Stewart, B., & Tingley, D. (Forthcoming). "Navigating the Local Modes of Big Data: The Case of Topic Models." In Data Analytics in Social Science, Government, and Industry. New York: Cambridge University Press.
multiSTM
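## Not run:
library(stm)

# Example using Gadarian data: preprocess the open-ended responses
temp <- textProcessor(documents = gadarian$open.ended.response,
                      metadata = gadarian)
meta <- temp$meta
vocab <- temp$vocab
docs <- temp$documents

# Prepare the documents, vocabulary, and metadata for modeling
out <- prepDocuments(docs, vocab, meta)
docs <- out$documents
vocab <- out$vocab
meta <- out$meta

# Estimate candidate 3-topic models (runs = 20)
set.seed(02138)
mod.out <- selectModel(docs, vocab, K = 3,
                       prevalence = ~ treatment + s(pid_rep),
                       data = meta, runs = 20)

# Run the multimodality diagnostics on the candidate models
out <- multiSTM(mod.out, mass.threshold = .75,
                reg.formula = ~ treatment,
                metadata = gadarian)

plot(out)     # generate all available plots
plot(out, 1)  # generate only plot 1
## End(Not run)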