wordWindowOccurence: Gives the window in which a term occurred in a matrix.


Description

This function returns the occurrence of words (position.matrix) and the window of occurrence (window.matrix). This format allows the co-occurrence of words within sliding windows (i.e. word distance) to be calculated by multiplying position.matrix with window.matrix.
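
The multiplication described above can be sketched in base R. This is a minimal illustration, not the semnet implementation: the toy token list, the dense matrices, and the choice to count a term's own position as part of its window are all assumptions made for the example.

```r
# Toy token list: one document with four tokens (hypothetical data).
tokens <- data.frame(
  doc_id   = c(1, 1, 1, 1),
  position = 1:4,
  word     = c("a", "b", "c", "a"),
  stringsAsFactors = FALSE
)
vocab <- sort(unique(tokens$word))
window.size <- 1

# Position matrix: rows are token positions, columns are terms.
# A 1 marks the term that occurs at that position.
position.mat <- matrix(0, nrow = nrow(tokens), ncol = length(vocab),
                       dimnames = list(NULL, vocab))
position.mat[cbind(seq_len(nrow(tokens)), match(tokens$word, vocab))] <- 1

# Window matrix: a 1 marks every term found within window.size positions
# of the row's position, in the same document, on both sides.
# Assumption: the position itself is included in its own window.
window.mat <- matrix(0, nrow = nrow(tokens), ncol = length(vocab),
                     dimnames = list(NULL, vocab))
for (i in seq_len(nrow(tokens))) {
  near <- tokens$doc_id == tokens$doc_id[i] &
    abs(tokens$position - tokens$position[i]) <= window.size
  window.mat[i, unique(tokens$word[near])] <- 1
}

# Term-by-term co-occurrence: t(position.mat) %*% window.mat
cooc <- crossprod(position.mat, window.mat)
cooc["a", "b"]  # 1: "b" falls in the window of one occurrence of "a"
```

Each cell of `cooc` counts how many occurrences of the row term have the column term inside their window. For real corpora a sparse matrix class would be the more practical choice than the dense matrices used here.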

Usage

wordWindowOccurence(tokenlist, window.size = 3, direction = "<>",
  distance.as.value = F, doc.col = getOption("doc.col", "doc_id"),
  position.col = getOption("position.col", "position"),
  word.col = getOption("word.col", "word"))

Arguments

window.size

The maximum distance between two words for them to be counted as a co-occurrence.

direction

A string indicating whether only the left ('<') or right ('>') side of the window, or both ('<>'), should be used.

position

An integer vector giving the position of terms in a given context (e.g., document, paragraph, sentence).

term

A character vector giving the terms.

context

A vector giving the context in which terms occur (e.g., document, paragraph, sentence).
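
How window.size and direction combine can be illustrated with a small helper. `window_offsets` is hypothetical and not part of semnet. It assumes that "left" means preceding positions, "right" means following positions, and that offset 0 (the word's own position) is always included; the package may define these details differently.

```r
# Hypothetical helper (not part of semnet): the position offsets,
# relative to a word, that a window covers.
window_offsets <- function(window.size = 3, direction = "<>") {
  switch(direction,
    "<"  = seq(-window.size, 0),            # left side only (assumed: preceding words)
    ">"  = seq(0, window.size),             # right side only (assumed: following words)
    "<>" = seq(-window.size, window.size),  # both sides
    stop("direction must be '<', '>' or '<>'")
  )
}

window_offsets(2, "<")   # -2 -1  0
window_offsets(2, "<>")  # -2 -1  0  1  2
```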

Value

A list with two matrices. position.mat gives the specific position of a term, and window.mat gives the window in which each word occurred. The rows represent the positions of terms and match the input of this function (position, term and context). The columns represent terms.


kasperwelbers/semnet documentation built on May 20, 2019, 7:38 a.m.