get_cooccurrence: Get the co-occurrence of column elements per row entity (e.g....

View source: R/get_cooccurrence.R

get_cooccurrence    R Documentation

Get the co-occurrence of column elements per row entity (e.g. document co-occurrence of terms)

Description

Get the co-occurrence of column elements per row entity (e.g. document co-occurrence of terms)

Usage

get_cooccurrence(m, binarize = TRUE, threshold = 1)

Arguments

binarize

By default TRUE. Values greater than or equal to the threshold are set to 1; lower values are set to 0.

threshold

Threshold for binarization. By default 1. If the input is not an integer matrix of counts but, e.g., a probability matrix, a threshold such as 0.5 might be reasonable.

m

A matrix object (with a document term matrix containing integer counts in mind). Currently accepts base::matrix, Matrix::sparseMatrix or slam::simple_triplet_matrix.

Value

A sparseMatrix in which entry [i, j] is the co-occurrence of column elements i and j (e.g. words), summed over all rows (e.g. documents) of the input matrix.

Examples

mat <- cbind(A = c(2,1,1,0), B = c(2,0,1,0), C = c(0,1,1,0))
#      A B C
# [1,] 2 2 0
# [2,] 1 0 1
# [3,] 1 1 1
# [4,] 0 0 0
get_cooccurrence(mat)
#   A B C
# A 3 2 2
# B 2 2 1
# C 2 1 2
get_cooccurrence(mat, binarize = FALSE)
#   A B C
# A 6 5 2  <- note the difference regarding A and B
# B 5 5 1
# C 2 1 2
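
The two results above can be reproduced with a short base R sketch, which may help when reasoning about the binarize and threshold arguments. The helper cooccurrence_sketch is hypothetical and not part of textility. It assumes that binarization maps values greater than or equal to the threshold to 1 (inferred from the example output) and it returns a dense base matrix rather than a sparseMatrix.

```r
# Hypothetical helper for illustration only; not the package implementation.
cooccurrence_sketch <- function(m, binarize = TRUE, threshold = 1) {
  if (binarize) {
    # Assumption inferred from the example output: values >= threshold become 1.
    m <- (m >= threshold) * 1
  }
  # t(m) %*% m: entry [i, j] sums, over all rows, the product of columns i and j.
  crossprod(m)
}

mat <- cbind(A = c(2, 1, 1, 0), B = c(2, 0, 1, 0), C = c(0, 1, 1, 0))
cooccurrence_sketch(mat)
#   A B C
# A 3 2 2
# B 2 2 1
# C 2 1 2
cooccurrence_sketch(mat, binarize = FALSE)
#   A B C
# A 6 5 2
# B 5 5 1
# C 2 1 2

# Probability-like input with a threshold of 0.5 (made-up values).
prob <- cbind(A = c(0.9, 0.2, 0.6), B = c(0.7, 0.8, 0.1))
cooccurrence_sketch(prob, threshold = 0.5)
#   A B
# A 2 1
# B 1 2
```

With binarize = TRUE, the diagonal counts the rows in which each column element occurs at all. With binarize = FALSE, the raw counts are multiplied, so repeated occurrences within a row inflate the result.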

manuelbickel/textility documentation built on Nov. 25, 2022, 9:07 p.m.