View source: R/adjacency_functions.r
tokenWindowOccurence | R Documentation |
This function returns the occurrence of tokens (position.matrix) and the window of occurrence (window.matrix). This format enables the co-occurrence of tokens within sliding windows (i.e. token distance) to be calculated by multiplying position.matrix with window.matrix.
tokenWindowOccurence(
tc,
feature,
context_level = c("document", "sentence"),
window.size = 10,
direction = "<>",
distance_as_value = F,
batch_rows = NULL,
drop_empty_terms = T
)
tc |
a tCorpus object |
feature |
The name of the feature column |
context_level |
Select whether to use "document" or "sentence" as context boundaries |
window.size |
The distance within which tokens should occur from each other to be counted as a co-occurrence. |
direction |
a string indicating whether only the left ('<') or right ('>') side of the window, or both ('<>'), should be used. |
distance_as_value |
If TRUE, the values of the matrix will represent the shortest distance to the occurrence of a feature |
batch_rows |
Used in functions that call this function in batches |
drop_empty_terms |
If TRUE, empty terms (with zero occurrence) will be dropped |
A list with two matrices. position.mat gives the specific position of a term, and window.mat gives the window in which each token occurred. The rows represent the position of a term and match the input of this function (position, term and context). The columns represent terms.
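To make the multiplication of the two matrices concrete, here is a minimal base R sketch. It does not call this function or any package code. The token sequence, the variable names (`position_mat`, `window_mat`, `window_size`) and the use of `crossprod()` are assumptions made for illustration, based only on the description that both matrices have positions as rows and terms as columns.

```r
# Hypothetical toy data: one context with five token positions
tokens <- c("a", "b", "c", "a", "b")
terms <- sort(unique(tokens))
n <- length(tokens)
window_size <- 1  # look one position to the left and right ("<>")

# Position matrix: [i, t] is 1 if the token at position i is term t
position_mat <- matrix(0, nrow = n, ncol = length(terms),
                       dimnames = list(NULL, terms))
position_mat[cbind(seq_len(n), match(tokens, terms))] <- 1

# Window matrix: [i, t] is 1 if term t occurs within window_size of position i
window_mat <- matrix(0, nrow = n, ncol = length(terms),
                     dimnames = list(NULL, terms))
for (i in seq_len(n)) {
  lo <- max(1, i - window_size)
  hi <- min(n, i + window_size)
  window_mat[i, unique(match(tokens[lo:hi], terms))] <- 1
}

# Term-by-term co-occurrence: t(position_mat) %*% window_mat
# cooc[x, y] counts the positions of term x that have term y in their window
cooc <- crossprod(position_mat, window_mat)
print(cooc)
```

In this sketch the diagonal also counts a term falling inside its own window, so a real analysis would typically ignore or zero it. Dense matrices are used only to keep the example short; for a full corpus a sparse representation would be the more realistic choice.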