View source: R/dtm_functions.r
term_day_dist | R Documentation |
Calculate statistics for term occurrence across days
term_day_dist(dtm, meta = NULL, date.var = "date")
dtm |
A quanteda dfm. Alternatively, a DocumentTermMatrix from the tm package can be used, but then the meta parameter needs to be specified manually |
meta |
If dtm is a quanteda dfm and meta is NULL (the default), the meta data is obtained with docvars(dtm). Otherwise, the user has to supply the meta data.frame, whose rows must match the rows of the dtm (i.e. each row is a document). |
date.var |
The name of the meta column that specifies the document date. Default is "date". The values should be of type POSIXlt or POSIXct. |
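When the meta data is supplied manually, the date column may need to be converted first. The following is a minimal sketch in base R, assuming dates are stored as character strings; the data.frame, its column values and the UTC time zone are hypothetical and only illustrate the expected shape (one row per document, with a POSIXct date column).

```r
# Hypothetical meta data: one row per document, in the same order as the dtm rows
meta <- data.frame(
  id   = c("doc1", "doc2", "doc3"),
  date = c("2013-06-01 08:30:00", "2013-06-01 17:00:00", "2013-06-02 09:15:00"),
  stringsAsFactors = FALSE
)

# Convert the character column to POSIXct (time zone is an assumption)
meta$date <- as.POSIXct(meta$date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Check the class of the date column before passing meta along
class(meta$date)
```

A data.frame like this could then be passed as the meta argument together with a DocumentTermMatrix that has the same number of rows.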
A data.frame with statistics for each term.
freq: The number of times a term occurred
doc.freq: The number of documents in which a term occurred
days.n: The number of days on which a term occurred
days.pct: The percentage of days on which a term occurred
days.entropy: The entropy of the distribution of term frequency across days
days.entropy.norm: The normalized days.entropy, where 1 is a discrete uniform distribution
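To make the two entropy columns more concrete, here is a small sketch in base R. It assumes Shannon entropy with the natural logarithm, normalized by its maximum value log(number of days); the exact formula used inside term_day_dist is not stated here and may differ. The helper names day_entropy and day_entropy_norm are hypothetical and not part of the package.

```r
# Hypothetical helper: Shannon entropy of a term's frequency counts per day
day_entropy <- function(counts) {
  p <- counts / sum(counts)  # share of the term's occurrences on each day
  p <- p[p > 0]              # days with no occurrences contribute nothing
  -sum(p * log(p))
}

# Hypothetical helper: entropy divided by its maximum, log(number of days),
# so that a uniform distribution across days yields 1
day_entropy_norm <- function(counts) {
  day_entropy(counts) / log(length(counts))
}

even  <- c(5, 5, 5, 5)   # term spread evenly over four days
burst <- c(20, 0, 0, 0)  # term concentrated on a single day

day_entropy_norm(even)
day_entropy_norm(burst)
```

Under these assumptions the evenly spread term scores 1 and the term that appears on a single day scores 0. With only one day in the data the normalization divides by log(1) = 0, so this sketch is only meaningful for two or more days.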
tdd = term_day_dist(rnewsflow_dfm, date.var='date')
head(tdd)
tail(tdd)