serVis: View and/or share LDAvis in a browser

Description

View and/or share LDAvis in a browser.

Usage

serVis(json, out.dir = tempfile(), open.browser = interactive(),
  as.gist = FALSE, ...)

Arguments

json

character string output from createJSON.

out.dir

directory to store html/js/json files.

open.browser

Should R open a browser? If so, this function attempts to start a local file server via the servr package. This is necessary because the JavaScript needs to access local files, which most browsers do not allow.

as.gist

Should the visualization be uploaded as a gist? Prompts for an interactive login if the GITHUB_PAT environment variable is not set. For more details, see https://github.com/ropensci/gistr#authentication.

...

arguments passed on to gistr::gist_create.
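
Since as.gist = TRUE prompts for an interactive login when GITHUB_PAT is unset, a non-interactive script may want to check the variable first. The following is a minimal sketch in base R; the helper name has_github_pat is hypothetical and is not part of LDAvis or gistr.

```r
# Hypothetical helper (not part of LDAvis or gistr): TRUE when the
# GITHUB_PAT environment variable is set to a non-empty value.
has_github_pat <- function() {
  nzchar(Sys.getenv("GITHUB_PAT"))
}

# Only request a gist upload when no interactive login would be needed.
upload_as_gist <- has_github_pat()
# serVis(json, open.browser = FALSE, as.gist = upload_as_gist)
```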

Details

This function places the necessary html/js/css files (located in system.file("htmljs", package = "LDAvis")) in the directory specified by out.dir, starts a local file server in that directory (if necessary), and optionally opens the default browser at that location. If as.gist = TRUE, it attempts to upload these files as a gist; in this case, make sure the gistr package is installed and that your 'github.username' and 'github.password' are set in options.
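
The file-staging step described above can be sketched in base R. This is an illustration under stated assumptions, not the LDAvis source: the function name stage_vis_files is hypothetical, src.dir stands in for system.file("htmljs", package = "LDAvis"), and the output file name lda.json is an assumption.

```r
# Hypothetical sketch of the staging step; this is not the LDAvis source.
# src.dir stands in for system.file("htmljs", package = "LDAvis").
stage_vis_files <- function(json, src.dir, out.dir = tempfile()) {
  dir.create(out.dir, showWarnings = FALSE, recursive = TRUE)
  # Copy every template file (html/js/css) into the output directory.
  to.copy <- list.files(src.dir, full.names = TRUE)
  file.copy(to.copy, out.dir, overwrite = TRUE, recursive = TRUE)
  # Write the JSON string next to the templates (file name is an assumption).
  writeLines(json, file.path(out.dir, "lda.json"))
  invisible(out.dir)
}

# Demo with a stand-in template directory so the sketch runs without LDAvis.
src <- tempfile("htmljs")
dir.create(src)
writeLines("<html></html>", file.path(src, "index.html"))
out <- stage_vis_files('{"topic.order": [1, 2]}', src.dir = src)
list.files(out)
```

A local file server from the servr package, for example servr::httd(out), could then serve the staged directory.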

Value

An invisible object.

Author(s)

Carson Sievert

See Also

createJSON

Examples

## Not run: 
# Use of serVis is documented here:
help(createJSON, package = "LDAvis")

## End(Not run)

Example output

createJSON               package:LDAvis                R Documentation

Create the JSON object to read into the javascript visualization

Description:

     This function creates the JSON object that feeds the visualization
     template. For a more detailed overview, see 'vignette("details",
     package = "LDAvis")'

Usage:

     createJSON(phi = matrix(), theta = matrix(), doc.length = integer(),
       vocab = character(), term.frequency = integer(), R = 30,
       lambda.step = 0.01, mds.method = jsPCA, cluster, plot.opts = list(xlab =
       "PC1", ylab = "PC2"), ...)
     
Arguments:

     phi: matrix, with each row containing the distribution over terms
          for a topic, with as many rows as there are topics in the
          model, and as many columns as there are terms in the
          vocabulary.

   theta: matrix, with each row containing the probability distribution
          over topics for a document, with as many rows as there are
          documents in the corpus, and as many columns as there are
          topics in the model.

doc.length: integer vector containing the number of tokens in each
          document of the corpus.

   vocab: character vector of the terms in the vocabulary (in the same
          order as the columns of 'phi'). Each term must have at least
          one character.

term.frequency: integer vector containing the frequency of each term in
          the vocabulary.

       R: integer, the number of terms to display in the barcharts of
          the interactive viz. Default is 30. Recommended to be roughly
          between 10 and 50.

lambda.step: a value between 0 and 1. Determines the interstep distance
          in the grid of lambda values over which to iterate when
          computing relevance. Default is 0.01. Recommended to be
          between 0.01 and 0.1.

mds.method: a function that takes 'phi' as an input and outputs a K by
          2 data.frame (or matrix). The output approximates the
          distance between topics. See jsPCA for details on the default
          method.

 cluster: a cluster object created from the parallel package. If
          supplied, computations are performed using parLapply instead
          of lapply.

plot.opts: a named list used to customize various plot elements. By
          default, the x and y axes are labeled "PC1" and "PC2"
          (principal components 1 and 2), since jsPCA is the default
          scaling method.

     ...: not currently used.
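
The shape requirements listed above can be checked before calling createJSON. The following is a minimal sketch in base R; check_lda_inputs is a hypothetical helper, the tolerance tol is an assumption, and createJSON may perform its own validation that differs from this.

```r
# Hypothetical pre-flight check; createJSON may perform its own validation.
check_lda_inputs <- function(phi, theta, doc.length, vocab, term.frequency,
                             tol = 1e-8) {
  K <- nrow(phi)    # number of topics
  W <- ncol(phi)    # vocabulary size
  D <- nrow(theta)  # number of documents
  stopifnot(
    ncol(theta) == K,
    length(doc.length) == D,
    length(vocab) == W,
    length(term.frequency) == W,
    all(nchar(vocab) > 0),               # each term has at least one character
    all(abs(rowSums(phi) - 1) < tol),    # rows of phi are distributions
    all(abs(rowSums(theta) - 1) < tol)   # rows of theta are distributions
  )
  invisible(TRUE)
}

# Toy model (made up): 2 topics, 3 terms, 2 documents.
phi <- matrix(c(0.5, 0.3, 0.2,
                0.1, 0.1, 0.8), nrow = 2, byrow = TRUE)
theta <- matrix(c(0.9, 0.1,
                  0.4, 0.6), nrow = 2, byrow = TRUE)
ok <- check_lda_inputs(phi, theta, doc.length = c(10L, 20L),
                       vocab = c("a", "b", "c"),
                       term.frequency = c(8L, 6L, 16L))
```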

Details:

     The function first computes the topic frequencies (across the
     whole corpus), and then it reorders the topics in decreasing order
     of frequency. The main computation is to loop through the topics
     and through the grid of lambda values (determined by
     'lambda.step') to compute the 'R' most _relevant_ terms for each
     topic and value of lambda.
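
The steps described above can be sketched on a toy model. This is an illustration, not the createJSON source: the input values are made up, relevance follows the definition in Sievert and Shirley (2014), and the way the term marginals are computed here (term.frequency divided by its sum) is an assumption.

```r
# Toy inputs (made up for illustration): 2 topics, 3 terms, 2 documents.
phi <- matrix(c(0.5, 0.3, 0.2,
                0.1, 0.1, 0.8), nrow = 2, byrow = TRUE)
theta <- matrix(c(0.9, 0.1,
                  0.4, 0.6), nrow = 2, byrow = TRUE)
doc.length <- c(10, 20)
vocab <- c("a", "b", "c")
term.frequency <- c(8, 6, 16)

# Expected token count per topic across the corpus, then reorder the topics.
topic.frequency <- colSums(theta * doc.length)
topic.order <- order(topic.frequency, decreasing = TRUE)
phi <- phi[topic.order, , drop = FALSE]

# Relevance as defined in Sievert and Shirley (2014):
# lambda * log(phi) + (1 - lambda) * log(phi / p(term)).
term.prob <- term.frequency / sum(term.frequency)
relevance <- function(lambda) {
  lambda * log(phi) + (1 - lambda) * log(t(t(phi) / term.prob))
}

# Top R terms for every topic at every lambda value on the grid.
R <- 2
lambda.seq <- seq(0, 1, by = 0.25)  # lambda.step = 0.25 for brevity
top.terms <- lapply(lambda.seq, function(lambda) {
  rel <- relevance(lambda)
  apply(rel, 1, function(r) vocab[order(r, decreasing = TRUE)[1:R]])
})
```

Each element of top.terms is an R by K character matrix whose columns hold the most relevant terms for one topic at one value of lambda.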

Value:

     A string containing JSON content which can be written to a file or
     fed into serVis for easy viewing/sharing. One element of this
     string is the new ordering of the topics.
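
Because the return value is a plain string, it can be saved with base R file functions. The sketch below uses a made-up stand-in for the createJSON output; the file name lda.json is an assumption.

```r
# Stand-in for the string returned by createJSON (contents are made up).
json <- '{"topic.order": [2, 1, 3]}'

# Write the string to disk and read it back unchanged.
json.path <- file.path(tempdir(), "lda.json")
writeLines(json, json.path)
restored <- paste(readLines(json.path), collapse = "\n")
identical(restored, json)
```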

References:

     Sievert, C. and Shirley, K. (2014) _LDAvis: A Method for
     Visualizing and Interpreting Topics_, ACL Workshop on Interactive
     Language Learning, Visualization, and Interfaces. <URL:
     http://nlp.stanford.edu/events/illvi2014/papers/sievert-illvi2014.pdf>

See Also:

     serVis

Examples:

     ## Not run:
     
     data(TwentyNewsgroups, package="LDAvis")
     # create the json object, start a local file server, open in default browser
     json <- with(TwentyNewsgroups,
                  createJSON(phi, theta, doc.length, vocab, term.frequency))
     serVis(json) # press ESC or Ctrl-C to kill
     
     # createJSON() reorders topics in decreasing order of topic frequency
     RJSONIO::fromJSON(json)$topic.order
     
     # You may want to just write the JSON and other dependency files
     # to a folder named TwentyNewsgroups under the working directory
     serVis(json, out.dir = 'TwentyNewsgroups', open.browser = FALSE)
     # then you could use a server of your choice; for example,
     # open your terminal, type `cd TwentyNewsgroups && python3 -m http.server`
     # (or `python -m SimpleHTTPServer` on Python 2),
     # then open http://localhost:8000 in your web browser
     
     # A different data set: the Jeopardy Questions+Answers data:
     # Install LDAvisData (the associated data package) if not already installed:
     # devtools::install_github("cpsievert/LDAvisData")
     library(LDAvisData)
     data(Jeopardy, package="LDAvisData")
     json <- with(Jeopardy,
                  createJSON(phi, theta, doc.length, vocab, term.frequency))
     serVis(json) # Check out Topic 22 (bodies of water!)
     
     # If you have a GitHub account, you can even publish as a gist
     # which allows you to easily share with others!
     serVis(json, as.gist = TRUE)
     
     # Run createJSON on a local cluster of worker processes to speed it up
     system.time(
     json <- with(TwentyNewsgroups,
                  createJSON(phi, theta, doc.length, vocab, term.frequency))
     )
     #   user  system elapsed
     # 14.415   0.800  15.066
     library("parallel")
     cl <- makeCluster(detectCores() - 1)
     cl # socket cluster with 3 nodes on host 'localhost'
     system.time(
      json <- with(TwentyNewsgroups,
        createJSON(phi, theta, doc.length, vocab, term.frequency,
          cluster = cl))
     )
     #   user  system elapsed
     #  2.006   0.361   8.822
     stopCluster(cl) # shut down the worker processes when finished
     
     # another scaling method (svd + tsne)
     library("tsne")
     svd_tsne <- function(x) tsne(svd(x)$u)
     json <- with(TwentyNewsgroups,
                  createJSON(phi, theta, doc.length, vocab, term.frequency,
                             mds.method = svd_tsne,
                             plot.opts = list(xlab="", ylab="")
                             )
                  )
     serVis(json) # Results in a different topic layout in the left panel
     ## End(Not run)
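
The svd + tsne example above relies on mds.method accepting any function that maps phi to a K by 2 data.frame or matrix. As a further illustration, here is a minimal sketch of a scaling method that needs only base R: classical multidimensional scaling on Euclidean distances between the rows of phi. The function name euclid_mds, the column names x and y, and the toy phi are assumptions made for this example; this is not the default jsPCA method.

```r
# Hypothetical alternative scaling method using only base R:
# classical MDS on Euclidean distances between the topic rows of phi.
euclid_mds <- function(phi) {
  fit <- stats::cmdscale(stats::dist(phi), k = 2)
  data.frame(x = fit[, 1], y = fit[, 2])
}

# Toy phi with 4 topics over 5 terms; each row is normalized to sum to 1.
set.seed(1)
phi <- matrix(runif(20), nrow = 4)
phi <- phi / rowSums(phi)
coords <- euclid_mds(phi)
dim(coords)

# With a fitted model, the function would be passed like this:
# json <- createJSON(phi, theta, doc.length, vocab, term.frequency,
#                    mds.method = euclid_mds)
```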
     

LDAvis documentation built on May 2, 2019, 10:59 a.m.