export_browser_data | R Documentation |
Transform and save modeling results in a format suitable for use by dfr-browser, the web-browser based model browser. For a quick export and immediate viewing, see also dfr_browser.
export_browser_data(
  m,
  out_dir,
  zipped = TRUE,
  n_top_words = 50,
  n_scaled_words = 1000,
  supporting_files = FALSE,
  overwrite = FALSE,
  internalize = FALSE,
  info = NULL,
  proper = FALSE,
  digits = getOption("digits"),
  permute = NULL,
  metadata_header = FALSE
)
m |
|
out_dir |
directory for output. If |
zipped |
should the larger data files be zipped? (If TRUE, uses
|
n_top_words |
how many top words per topic to save? |
n_scaled_words |
how many word types to use in scaled coordinates calculation? |
supporting_files |
if TRUE (FALSE is default), all the files
needed to run the browser are copied to |
overwrite |
if TRUE, this will clobber existing files |
internalize |
always set to FALSE. If TRUE, model data is in the browser home page rather than separate files, but this behavior is deprecated. See Details. |
info |
a list of dfr-browser parameters. Converted to JSON with
|
proper |
if TRUE, the document-topic and topic-word matrices are smoothed by the hyperparameters alpha and beta (respectively) and normalized before export; by default, the "raw" sampling weights are exported instead. For MALLET models, the smoothed and normalized weights then give the maximum a posteriori estimates of the corresponding probabilities, which is "properly" what the modeling process yields (but may disguise the effects of variations in document length and increase the storage space required). |
digits |
if |
permute |
if non-NULL, specifies a renumbering of the topics: the new
topic |
metadata_header |
if TRUE (FALSE is default), the exported metadata CSV will have a header row (not expected by dfr-browser by default) |
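To make the effect of the proper argument concrete, here is a minimal base R sketch of smoothing and normalizing a document-topic matrix. The toy counts, the alpha values, and the variable names are assumptions chosen for illustration; this is not the package's own implementation.

```r
# Toy document-topic sampling weights: 2 documents (rows) x 3 topics (columns).
# These values are hypothetical.
doc_topics <- matrix(c(8, 1, 1,
                       0, 2, 0), nrow = 2, byrow = TRUE)

# Hypothetical per-topic hyperparameters alpha.
alpha <- c(0.5, 0.5, 0.5)

# Smooth: add alpha[k] to every entry in topic column k.
smoothed <- sweep(doc_topics, 2, alpha, "+")

# Normalize: divide each row by its sum so every document's weights sum to 1.
proper_doc_topics <- smoothed / rowSums(smoothed)
```

In this sketch the two documents have very different total weights (10 and 2), but after normalization both rows sum to 1, which is how document-length variation gets disguised. The zero entries also become nonzero, which suggests why the smoothed matrix can need more storage.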
This routine reports on its progress. By default, it saves zipped versions of the document-topics matrix and metadata files; dfr-browser supports client-side unzipping. This function compresses files using R's zip command. If that fails, set zipped=F (and, if you wish, zip the files using another program).
A detailed description of the output files can be found in the dfr-browser technical notes at http://github.com/agoldst/dfr-browser.
This package includes a copy of the dfr-browser files necessary to run the browser. By default, this routine only exports data files. To also copy over the dfr-browser source (javascript, HTML, and CSS), pass supporting_files=T.
If you are working with non-JSTOR documents, the one file that will reflect this is the exported metadata. dfr-browser expects eight metadata columns by default: id,title,author,journaltitle,volume,issue,pubdate,pagerange. This function looks for these eight columns and, if it finds them all, writes the metadata with these columns in this order. Any remaining columns are pushed all the way to the right of the output. (dfr-browser ignores them unless you customize it.) If any of these columns is not present in metadata(m), then export_browser_data will simply save all the metadata as is, adjusting only the CSV format to match the baseline expectation of dfr-browser (namely, a headerless CSV conforming to RFC 4180).
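The column handling described above can be sketched in base R as follows. The sample metadata row and the temporary output file are hypothetical, and the code only approximates the described behavior; it is not the package's own code and does not claim full RFC 4180 conformance.

```r
# Columns dfr-browser expects by default, in order (as listed above).
expected <- c("id", "title", "author", "journaltitle",
              "volume", "issue", "pubdate", "pagerange")

# Hypothetical metadata with one extra column and a shuffled column order.
meta <- data.frame(
  title = "On Topics", id = "10.2307/123", publisher = "Example Press",
  author = "A. Author", journaltitle = "PMLA", volume = "1", issue = "2",
  pubdate = "1990-01-01T00:00:00Z", pagerange = "pp. 1-10",
  stringsAsFactors = FALSE
)

if (all(expected %in% names(meta))) {
  # Expected columns first; any remaining columns are pushed to the right.
  meta <- meta[, c(expected, setdiff(names(meta), expected)), drop = FALSE]
}

# Write a headerless, comma-separated file with quoted fields and
# embedded quotation marks doubled.
out_file <- tempfile(fileext = ".csv")
write.table(meta, out_file, sep = ",", quote = TRUE, qmethod = "double",
            row.names = FALSE, col.names = FALSE)
```

If any expected column were missing, the condition would be false and the data frame would be written in its existing column order.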
If your metadata does not match these expectations, an alternative is to set dfr-browser's configuration parameters VIS.metadata.type and VIS.bib.type to "base" (using the info parameter) and to write out a metadata file with a header by passing metadata_header=T to this function or dfr_browser. For polished results, more customization of dfr-browser might be necessary.
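As a rough sketch of that alternative, the info list might be built as shown here. It rests on an assumption that should be checked against the dfr-browser technical notes: that dotted names such as VIS.metadata.type correspond to nested list elements once info is converted to JSON. The model object m and the output directory are hypothetical.

```r
# Assumption: VIS.metadata.type and VIS.bib.type map to nested list
# elements under a top-level "VIS" entry.
info <- list(
  VIS = list(
    metadata = list(type = "base"),
    bib = list(type = "base")
  )
)

# `m` is a hypothetical, already-trained model object. The export is
# skipped when dfrtopics is not installed or no `m` exists.
if (requireNamespace("dfrtopics", quietly = TRUE) && exists("m")) {
  dfrtopics::export_browser_data(m, out_dir = "browser/data",
                                 info = info, metadata_header = TRUE)
}
```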
Note that you can adjust the metadata held on the model object by assigning to metadata(m) before exporting the browser data. In particular, if you have many documents, you may wish to conserve space by eliminating metadata columns that are not used by the visualization: for example, metadata(m)$publisher <- NULL. Earlier versions of dfrtopics tried to eliminate such columns automatically, but this more conservative approach aims to allow you more flexibility about what gets exported.
To insert the data directly into the main index.html file, pass internalize=T. This behavior is now deprecated and will be removed in a future version.
dfr_browser, model_dfr_documents, train_model, topic_scaled_2d, and the functions for outputting individual custom files: export_browser_topic_words, export_browser_doc_topics, export_browser_metadata, export_browser_topic_scaled, export_browser_info.
## Not run:
m <- model_dfr_documents("citations.CSV", "wordcounts", "stoplist.txt", n_topics=40)

# export all files needed for browser program
export_browser_data(m, out_dir="browser", supporting_files=T)

# or: overwrite model data only for an already-existing browser
export_browser_data(m, out_dir="browser/data", supporting_files=F, overwrite=T)
## End(Not run)