counts: Utilities for count matrices


Description

Tools for manipulating (sparse) count matrices.

Usage

normalize(x,byrow=TRUE)
stm_tfidf(x)

Arguments

x

A simple_triplet_matrix or matrix of counts.

byrow

Logical. If TRUE (the default), normalize by row totals; if FALSE, normalize by column totals.
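To make the effect of byrow concrete, here is a minimal base-R sketch of the two normalizations. The function normalize_sketch is a hypothetical stand-in written for illustration, not the package's implementation. It assumes a dense matrix with no all-zero rows or columns; how the package handles zero totals is not stated on this page.

```r
# Hypothetical base-R stand-in for normalize(); not the package's own code.
normalize_sketch <- function(x, byrow = TRUE) {
  x <- as.matrix(x)
  if (byrow) {
    x / rowSums(x)                 # entry [i, j] divided by the total of row i
  } else {
    sweep(x, 2, colSums(x), "/")   # entry [i, j] divided by the total of column j
  }
}

m <- matrix(1:9, ncol = 3)
by_row <- normalize_sketch(m)
by_col <- normalize_sketch(m, byrow = FALSE)

rowSums(by_row)   # every row sums to 1
colSums(by_col)   # every column sums to 1
```

With byrow=TRUE each document (row) becomes a vector of term proportions; with byrow=FALSE each term (column) is rescaled instead.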

Value

normalize returns x with each count divided by its row total (byrow=TRUE) or column total (byrow=FALSE). stm_tfidf returns a matrix with entries x_{ij} \log[ n/(d_j+1) ], where x_{ij} is the frequency of term j in document i, n is the number of documents (rows of x), and d_j is the number of documents containing term j.
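The tf-idf weights can be reproduced by hand in base R, which helps when checking output. The sketch below is illustrative only: tfidf_sketch is a hypothetical helper, n is assumed to be the number of rows of x, and the weight is applied to x exactly as supplied, since this page does not say whether the package rescales counts to frequencies first.

```r
# Hypothetical base-R illustration of the weights x_ij * log(n / (d_j + 1)).
tfidf_sketch <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)              # assumption: n is the number of documents (rows)
  d <- colSums(x > 0)       # d_j: number of documents containing term j
  idf <- log(n / (d + 1))   # one weight per term (column)
  sweep(x, 2, idf, "*")     # multiply column j by idf_j
}

# Four documents, three terms; term 3 appears in every document.
x <- matrix(c(1, 0, 0, 0,
              2, 1, 0, 0,
              1, 1, 1, 1), ncol = 3)
w <- tfidf_sketch(x)
w
```

Because of the +1 in the denominator, a term that occurs in every document gets the weight log(n/(n+1)), which is negative, so the third column of w is below zero. Zero counts stay zero.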

Author(s)

Matt Taddy

Examples

normalize( matrix(1:9, ncol=3) )
normalize( matrix(1:9, ncol=3), byrow=FALSE )

(x <- matrix(rbinom(15,size=2,prob=.25),ncol=3))
stm_tfidf(x)

maptpx documentation built on May 30, 2017, 4:45 a.m.