View and/or share LDAvis in a browser.
Usage:
serVis(json, out.dir = tempfile(), open.browser = interactive(),
  as.gist = FALSE, ...)
Arguments:
json: character string output from createJSON.
out.dir: directory to store html/js/json files.
open.browser: Should R open a browser? If yes, this function will attempt to create a local file server via the servr package. This is necessary since the javascript needs to access local files and most browsers will not allow this.
as.gist: should the vis be uploaded as a gist? Will prompt for an interactive login if the GITHUB_PAT environment variable is not set. For more details, see https://github.com/ropensci/gistr#authentication.
...: arguments passed onto
Details:
This function will place the necessary html/js/css files (located in system.file("htmljs", package = "LDAvis")) in a directory specified by out.dir, start a local file server in that directory (if necessary), and (optionally) open the default browser in this directory. If as.gist=TRUE, it will attempt to upload these files as a gist (in this case, please make sure you have the gistr package installed as well as your 'github.username' and 'github.password' set in options).
Value:
An invisible object.
Author(s):
Carson Sievert
See Also:
createJSON
Examples:
## Not run:
# Use of serVis is documented here:
help(createJSON, package = "LDAvis")
## End(Not run)
createJSON package:LDAvis R Documentation
Create the JSON object to read into the javascript visualization
Description:
This function creates the JSON object that feeds the visualization
template. For a more detailed overview, see 'vignette("details",
package = "LDAvis")'
Usage:
createJSON(phi = matrix(), theta = matrix(), doc.length = integer(),
vocab = character(), term.frequency = integer(), R = 30,
lambda.step = 0.01, mds.method = jsPCA, cluster, plot.opts = list(xlab =
"PC1", ylab = "PC2"), ...)
Arguments:
phi: matrix, with each row containing the distribution over terms
for a topic, with as many rows as there are topics in the
model, and as many columns as there are terms in the
vocabulary.
theta: matrix, with each row containing the probability distribution
over topics for a document, with as many rows as there are
documents in the corpus, and as many columns as there are
topics in the model.
doc.length: integer vector containing the number of tokens in each
document of the corpus.
vocab: character vector of the terms in the vocabulary (in the same
order as the columns of 'phi'). Each term must have at least
one character.
term.frequency: integer vector containing the frequency of each term in
the vocabulary.
R: integer, the number of terms to display in the barcharts of
the interactive viz. Default is 30. Recommended to be roughly
between 10 and 50.
lambda.step: a value between 0 and 1. Determines the interstep distance
in the grid of lambda values over which to iterate when
computing relevance. Default is 0.01. Recommended to be
between 0.01 and 0.1.
mds.method: a function that takes 'phi' as an input and outputs a K by
2 data.frame (or matrix). The output approximates the
distance between topics. See jsPCA for details on the default
method.
cluster: a cluster object created from the parallel package. If
supplied, computations are performed using parLapply instead
of lapply.
plot.opts: a named list used to customize various plot elements. By
default, the x and y axes are labeled "PC1" and "PC2"
(principal components 1 and 2), since jsPCA is the default
scaling method.
...: not currently used.
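To make the argument shapes above concrete, here is a minimal base R sketch on small made-up matrices (not data shipped with LDAvis). The helper names `check_inputs` and `euclid_mds` are hypothetical and not part of the package; `euclid_mds` applies classical scaling to Euclidean distances between topics, which is a stand-in for illustration and not the default jsPCA method.

```r
# Toy inputs (all made up): K = 4 topics, W = 6 terms, D = 5 documents.
set.seed(1)
K <- 4; W <- 6; D <- 5
phi   <- matrix(runif(K * W), nrow = K); phi   <- phi / rowSums(phi)
theta <- matrix(runif(D * K), nrow = D); theta <- theta / rowSums(theta)
vocab <- c("alpha", "beta", "gamma", "delta", "epsilon", "zeta")
doc.length <- c(10L, 20L, 15L, 5L, 30L)
term.frequency <- c(20L, 15L, 15L, 10L, 10L, 10L)

# Hypothetical helper: checks the dimension and normalization
# requirements stated in the argument descriptions.
check_inputs <- function(phi, theta, doc.length, vocab, term.frequency) {
  stopifnot(
    ncol(phi) == length(vocab),            # one column of phi per term
    ncol(phi) == length(term.frequency),   # one frequency per term
    nrow(theta) == length(doc.length),     # one row of theta per document
    ncol(theta) == nrow(phi),              # same number of topics in both
    all(nchar(vocab) > 0),                 # no empty terms
    isTRUE(all.equal(rowSums(phi), rep(1, nrow(phi)))),
    isTRUE(all.equal(rowSums(theta), rep(1, nrow(theta))))
  )
  invisible(TRUE)
}

# Hypothetical mds.method: takes phi and returns a K by 2 matrix.
euclid_mds <- function(phi) cmdscale(dist(phi), k = 2)

check_inputs(phi, theta, doc.length, vocab, term.frequency)
coords <- euclid_mds(phi)
dim(coords)  # one row per topic, two columns
```

A function like this could presumably be supplied as `mds.method = euclid_mds`; in that case the default axis labels in `plot.opts` would no longer describe principal components and may be worth changing.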
Details:
The function first computes the topic frequencies (across the
whole corpus), and then it reorders the topics in decreasing order
of frequency. The main computation is to loop through the topics
and through the grid of lambda values (determined by
'lambda.step') to compute the 'R' most _relevant_ terms for each
topic and value of lambda.
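The steps described above can be sketched in base R on a toy model. This is a simplified illustration, not the package's actual implementation: the numbers are made up, the helper `top_terms` is hypothetical, and the relevance formula is assumed to be the one from the referenced paper, lambda * log(phi) + (1 - lambda) * log(phi / p), where p is the marginal probability of each term in the corpus.

```r
# Toy model (made-up numbers): K = 3 topics, W = 5 terms, D = 4 documents.
phi <- rbind(c(0.50, 0.20, 0.10, 0.10, 0.10),
             c(0.10, 0.10, 0.60, 0.10, 0.10),
             c(0.05, 0.05, 0.10, 0.40, 0.40))
theta <- rbind(c(0.8, 0.1, 0.1),
               c(0.2, 0.7, 0.1),
               c(0.1, 0.8, 0.1),
               c(0.3, 0.3, 0.4))
doc.length <- c(100L, 50L, 200L, 50L)
vocab <- c("a", "b", "c", "d", "e")
term.frequency <- c(90L, 50L, 160L, 50L, 50L)
R <- 2
lambda.step <- 0.25

# Step 1: topic frequencies across the whole corpus
# (each row of theta is weighted by that document's length).
topic.frequency <- colSums(theta * doc.length)

# Step 2: reorder topics in decreasing order of frequency.
topic.order <- order(topic.frequency, decreasing = TRUE)
phi <- phi[topic.order, ]

# Step 3: for each lambda on the grid, find the R most relevant terms per topic.
p.w <- term.frequency / sum(term.frequency)   # marginal term probabilities
lambda.seq <- seq(0, 1, by = lambda.step)

top_terms <- function(lambda) {
  # assumed relevance formula (see lead-in)
  rel <- lambda * log(phi) + (1 - lambda) * log(sweep(phi, 2, p.w, "/"))
  # one row per topic, R columns of terms ranked by relevance
  t(apply(rel, 1, function(r) vocab[order(r, decreasing = TRUE)[seq_len(R)]]))
}

relevant <- lapply(lambda.seq, top_terms)
names(relevant) <- lambda.seq
topic.order
```

When a `cluster` is supplied, the documentation indicates the per-lambda loop is run with parLapply instead of the lapply used in this sketch.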
Value:
A string containing JSON content which can be written to a file or
fed into serVis for easy viewing/sharing. One element of this
string is the new ordering of the topics.
References:
Sievert, C. and Shirley, K. (2014) _LDAvis: A Method for
Visualizing and Interpreting Topics_, ACL Workshop on Interactive
Language Learning, Visualization, and Interfaces. <URL:
http://nlp.stanford.edu/events/illvi2014/papers/sievert-illvi2014.pdf>
See Also:
serVis
Examples:
## Not run:
data(TwentyNewsgroups, package="LDAvis")
# create the json object, start a local file server, open in default browser
json <- with(TwentyNewsgroups,
createJSON(phi, theta, doc.length, vocab, term.frequency))
serVis(json) # press ESC or Ctrl-C to kill
# createJSON() reorders topics in decreasing order of topic frequency
RJSONIO::fromJSON(json)$topic.order
# You may want to just write the JSON and other dependency files
# to a folder named TwentyNewsgroups under the working directory
serVis(json, out.dir = 'TwentyNewsgroups', open.browser = FALSE)
# then you could use a server of your choice; for example,
# open your terminal, type `cd TwentyNewsgroups && python3 -m http.server`
# then open http://localhost:8000 in your web browser
# A different data set: the Jeopardy Questions+Answers data:
# Install LDAvisData (the associated data package) if not already installed:
# devtools::install_github("cpsievert/LDAvisData")
library(LDAvisData)
data(Jeopardy, package="LDAvisData")
json <- with(Jeopardy,
createJSON(phi, theta, doc.length, vocab, term.frequency))
serVis(json) # Check out Topic 22 (bodies of water!)
# If you have a GitHub account, you can even publish as a gist
# which allows you to easily share with others!
serVis(json, as.gist = TRUE)
# Run createJSON on a cluster of machines to speed it up
system.time(
json <- with(TwentyNewsgroups,
createJSON(phi, theta, doc.length, vocab, term.frequency))
)
# user system elapsed
# 14.415 0.800 15.066
library("parallel")
cl <- makeCluster(detectCores() - 1)
cl # socket cluster with 3 nodes on host 'localhost'
system.time(
json <- with(TwentyNewsgroups,
createJSON(phi, theta, doc.length, vocab, term.frequency,
cluster = cl))
)
# user system elapsed
# 2.006 0.361 8.822
# another scaling method (svd + tsne)
library("tsne")
svd_tsne <- function(x) tsne(svd(x)$u)
json <- with(TwentyNewsgroups,
createJSON(phi, theta, doc.length, vocab, term.frequency,
mds.method = svd_tsne,
plot.opts = list(xlab="", ylab="")
)
)
serVis(json) # Results in a different topic layout in the left panel
## End(Not run)