create_hash_function: Hash Trick

View source: R/features.R

Description

Replace terms with their hashed values, using a hash function that outputs integers from 1 to N, where N is the cardinality passed to create_hash_function.
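
As an illustration of the idea only, here is a minimal base R sketch of a hash function that maps a term to an integer from 1 to N. The helper make_toy_hash and its polynomial hashing scheme are assumptions made for this example; they are not the implementation used by create_hash_function, and the indices they produce will not match those of the package.

```r
# Hypothetical helper, not part of textanalysis: builds a toy hash function
# that maps a term to an integer between 1 and `cardinality`.
make_toy_hash <- function(cardinality = 100L) {
  stopifnot(cardinality >= 1L)
  function(term) {
    codes <- utf8ToInt(term)  # Unicode code points of the term
    h <- 0
    for (code in codes) {
      # Polynomial rolling hash, reduced modulo `cardinality` at each step
      h <- (h * 31 + code) %% cardinality
    }
    as.integer(h) + 1L  # shift the range from 0..N-1 to 1..N
  }
}

toy_hash <- make_toy_hash(10L)
terms <- c("a", "simple", "document")

# One index per term, each between 1 and 10
indices <- vapply(terms, toy_hash, integer(1))
print(indices)
```

The same term always receives the same index, which is what allows hashed values to stand in for the original terms.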

Usage

create_hash_function(cardinality = 100L)

hash(text, hash_func)

## S3 method for class 'character'
hash(text, hash_func)

## S3 method for class 'document'
hash(text, hash_func = NULL)

## S3 method for class 'documents'
hash(text, hash_func = NULL)

## S3 method for class 'dtm'
hash(text, hash_func = NULL)

## S3 method for class 'corpus'
hash(text, hash_func = NULL)

Arguments

cardinality

Maximum index used for hashing; hashed values range from 1 to cardinality (default 100L).

text

A character string, a document, an object of class documents, a corpus, or a document-term matrix (dtm).

hash_func

A hash function as returned by create_hash_function.
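
To make the role of cardinality concrete, the following base R sketch hashes five terms with a small and a large cardinality. The function toy_hash is a hypothetical stand-in (sum of code points modulo N) and is not how textanalysis computes its hashes; it only illustrates that when there are more distinct terms than available indices, some terms must share an index.

```r
# Hypothetical toy hash, not the textanalysis implementation:
# sum of the term's code points modulo `cardinality`, shifted to 1..N.
toy_hash <- function(term, cardinality) {
  as.integer(sum(utf8ToInt(term)) %% cardinality) + 1L
}

terms <- c("alpha", "beta", "gamma", "delta", "epsilon")

# Five terms, three available indices: at least two terms must collide
small <- vapply(terms, toy_hash, integer(1), cardinality = 3L)

# Five terms, one thousand available indices: far more room
large <- vapply(terms, toy_hash, integer(1), cardinality = 1000L)

print(length(unique(small)))
print(length(unique(large)))
```

A larger cardinality reduces the chance of collisions at the cost of a larger index space, so the right value depends on the vocabulary size.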

Examples

## Not run: 
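# Initialise the session before calling other functions
init_textanalysis()

# Hash function that outputs integers from 1 to 10
hash_func <- create_hash_function(10L)

# Hash a character string
hash("a", hash_func)

# Hash a document
doc <- string_document("A simple document.")
hash(doc, hash_func)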

## End(Not run)

news-r/textanalysis documentation built on Nov. 4, 2019, 9:40 p.m.